Open czentgr opened 7 months ago
CC: @kgpai @kagamiori @xiaoxmeng
@czentgr I can look into this if you haven't started already.
@kgpai Yes, I can take a look. Trying to make sense of things.
Note that --minloglevel=1 would exclude INFO messages. In the current code, JoinFuzzer.cpp:408 is a LOG(INFO) message, which appears in the above snippet but is tagged as an error message. For example, the log contains
E20240405 00:08:00.807075 61 JoinFuzzer.cpp:408] Executing query plan with GROUPED strategy[5 groups]:
while a local run with --minloglevel=1 does not contain the message, and with --minloglevel=0 it does:
I20240411 15:20:39.405054 8762369 JoinFuzzer.cpp:408] Executing query plan with UNGROUPED strategy[0 groups]:
Note the prefix E or I. Likewise, the iteration and seed messages are INFO messages and don't appear in the log output, which is sort of expected because --minloglevel=1 excludes INFO messages.
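To make the E and I prefixes concrete, here is a minimal shell sketch of how the leading letter of a glog line maps to the numeric level that --minloglevel compares against (0 = INFO, 1 = WARNING, 2 = ERROR, 3 = FATAL). The helper names glog_level and passes_minloglevel are hypothetical and exist only for this illustration; they are not part of Velox or glog.

```shell
# Map the leading severity letter of a glog-formatted line to its numeric level.
# Hypothetical helper for illustration only.
glog_level() {
  case "$1" in
    I[0-9]*) echo 0 ;;   # INFO
    W[0-9]*) echo 1 ;;   # WARNING
    E[0-9]*) echo 2 ;;   # ERROR
    F[0-9]*) echo 3 ;;   # FATAL
    *)       echo -1 ;;  # not a glog-prefixed line
  esac
}

# Succeed if a line would survive the given --minloglevel value.
# Lines without a glog prefix get level -1 and are treated as filtered.
passes_minloglevel() {
  level=$(glog_level "$1")
  [ "$level" -ge "$2" ]
}
```

Applied to the two lines quoted above, the E-prefixed line from the container run passes a threshold of 1, while the I-prefixed line from the local run only passes a threshold of 0.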
I tried on Mac and Linux. Same output for the latest code. --minloglevel=0 would generate much more log output, but the important logs for reproducing issues are missing. Am I overlooking something?
Anyway, I will try to repro it by running for a while and hopefully hitting an iteration that is causing the issue.
No repro of the issue so far. I let this run on a Linux VM for the weekend without repro. Each iteration gets a new seed and runs for 100s. Changing it to 900s just like in the test.
If you haven't seen it for a bit, then this could have been fixed (inadvertently or otherwise) by some subsequent PR.
Right, I've now run 90+ iterations (900s each). It could be a race somewhere that is just not visible on the machine I'm running it on. As for the seed messages that are missing from the container run log: I propose to make a PR that turns them into warnings (instead of info) in the Join Fuzzer, to make sure they appear in the log. Does that sound ok?
@czentgr That would be great, thank you!
I'm at 300 iterations. No repro. I suppose we should keep this issue open and with the PR, if it occurs again, we can continue the investigation.
Description
The SEMI LEFT JOIN result for the row type column returns NULL value when run with spilling.
The job result: https://github.com/facebookincubator/velox/actions/runs/8562532207/job/23466827795?pr=9372
Excerpt:
The iteration seed is not in the log (logtostderr is turned on, so presumably LOG(INFO) messages should show up there?). Due to the missing seed, it is not clear how to repro the issue.
Error Reproduction
./velox_join_fuzzer_test \
    --seed ${RANDOM} \
    --duration_sec $DURATION \
    --logtostderr=1 \
    --minloglevel=1 \
The problem is that the iteration seed was not logged and $RANDOM is not logged either.
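One way to avoid losing the seed, sketched here under assumptions, is to pick it once in the wrapper script and print it before the fuzzer starts, so it is recorded regardless of what --minloglevel filters. The FUZZER variable, the 900s default (taken from the duration mentioned in this thread), and the fallback to the shell PID when $RANDOM is unavailable are choices made for this sketch, not what the CI job actually does.

```shell
# Choose the seed once so the same value can be both logged and passed on.
# RANDOM is a bash feature; fall back to the shell's PID in plain sh.
SEED="${SEED:-${RANDOM:-$$}}"
DURATION="${DURATION:-900}"

# Record the seed on stderr before the run; this does not depend on glog levels.
echo "join fuzzer seed: ${SEED}, duration: ${DURATION}s" >&2

# Hypothetical path variable; the run is skipped if the binary is not present.
FUZZER="${FUZZER:-./velox_join_fuzzer_test}"
if [ -x "${FUZZER}" ]; then
  "${FUZZER}" \
    --seed "${SEED}" \
    --duration_sec "${DURATION}" \
    --logtostderr=1 \
    --minloglevel=1
fi
```

Re-running with the same value, for example `SEED=12345 ./run_fuzzer.sh` (script name hypothetical), would then pass the recorded seed back to the fuzzer.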
Relevant logs