HPCE / hpce-2017-cw5


edit_distance test crashes 1/5 times when outputs equal #12

Closed (natoucs closed 6 years ago)

natoucs commented 6 years ago

When running the examples in https://github.com/HPCE/hpce-2017-cw5/issues/7 with edit_distance, I get "Outputs are equal." 4 times out of 5. The remaining 1 time out of 5 I get:

LogLevel = 2 -> 2
[run_puzzle], 1510598186.04, 2, Created log.
[run_puzzle], 1510598186.04, 2, Creating random input
./script/edit_distance_test.sh: line 4: 31352 Floating point exception(core dumped) bin/create_puzzle_input edit_distance 5 2 > w/input.bin
Caught exception : StdoutStream::Recv - End of file.
Caught exception : StdoutStream::Recv - End of file.
LogLevel = 2 -> 2
[execute_puzzle], 1510598186.21, 2, Created log.
[execute_puzzle], 1510598186.21, 2, Loading input w/input.bin
Caught exception : FileInStream::Recv - Not all data was recieved, m_offset=0, todo=4, errno=0

I know this error could be ignored, but I am curious to know how it could be fixed.

Apparently, in file_in_stream.hpp, the Recv function ends because the return value from the _read function is 0 (end of file) or negative (an error). On MSDN I see:

"If the function tries to read at end of file, it returns 0. If fd is invalid, the file is not open for reading, or the file is locked, the invalid parameter handler is invoked, as described in Parameter Validation. If execution is allowed to continue, the function returns -1 and sets errno to EBADF."

fd is the first argument to the read function (read(m_fdRecv, pRead, todo)). However, I am not sure how or when fd would be invalid 1 time out of 5.
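The logged values m_offset=0, todo=4, errno=0 look more consistent with end of file on an empty w/input.bin than with an invalid fd, since an invalid descriptor would normally leave errno set to EBADF. A minimal sketch of a read loop that keeps the two cases apart is below. It assumes POSIX read(); the names read_exact and ReadResult are hypothetical and are not taken from file_in_stream.hpp.

```cpp
#include <cerrno>
#include <cstddef>
#include <unistd.h>   // POSIX read(); the Windows CRT equivalent is _read()

// Outcome of trying to read exactly `todo` bytes from a descriptor.
enum class ReadResult { Complete, EndOfFile, Error };

// Hypothetical helper, not the coursework's FileInStream::Recv.
// Reads until `todo` bytes have arrived, the stream ends, or read() fails.
// `received` reports how many bytes were actually stored in `dst`.
ReadResult read_exact(int fd, void *dst, std::size_t todo, std::size_t &received)
{
    char *p = static_cast<char *>(dst);
    received = 0;
    while (received < todo) {
        ssize_t got = read(fd, p + received, todo - received);
        if (got > 0) {
            received += static_cast<std::size_t>(got);
        } else if (got == 0) {
            return ReadResult::EndOfFile;   // valid descriptor, no more data
        } else if (errno != EINTR) {
            return ReadResult::Error;       // errno says why, e.g. EBADF
        }
        // got < 0 with errno == EINTR: interrupted by a signal, so retry
    }
    return ReadResult::Complete;
}
```

With a helper like this, a truncated or empty input file would be reported as EndOfFile with received smaller than todo, while a bad descriptor would be reported as Error with errno set.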

m8pple commented 6 years ago

It looks like the error is actually happening in bin/create_puzzle_input edit_distance 5 2, which is causing a truncated output stream. As you say, if it is input generation then it can be ignored, but it should still be fixed.

I'm not in front of a proper computer right now, so can't easily investigate I'm afraid.

Is there anything special about the scale of the problem generated? The scale you're using looks quite small, so I'm wondering whether there is a bug in the method I used to randomly mutate the string - I vaguely remember it took some hacking to get it working while keeping it reasonably concise, but can't remember how I did it (apparently badly).

Does the probability of crashes go down as the scale goes up?
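One possibility, purely as a guess without having looked at the generator: on x86 Linux the shell also prints "Floating point exception" for an integer division or modulo by zero, so a mutation step that computes something like a random value modulo the string length would crash whenever that length happens to be zero, which would be more likely at small scales. A guarded version of such a step might look like the sketch below. The function delete_random_char is hypothetical and is not the code in create_puzzle_input.

```cpp
#include <cstddef>
#include <random>
#include <string>

// Hypothetical mutation step, NOT the coursework's create_puzzle_input code.
// Deletes one randomly chosen character; an empty string is returned unchanged.
std::string delete_random_char(std::string s, std::mt19937 &rng)
{
    if (s.empty()) {
        // An unguarded `rng() % s.size()` would divide by zero at this point,
        // which on x86 Linux typically raises SIGFPE.
        return s;
    }
    // Pick an index in the closed range [0, size - 1].
    std::uniform_int_distribution<std::size_t> pick(0, s.size() - 1);
    s.erase(pick(rng), 1);
    return s;
}
```

If the real generator has a similar pattern, checking for a zero length (or zero range) before choosing a random position would remove the crash without changing the output for non-empty strings.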

natoucs commented 6 years ago

I no longer get the bug, and it seemed to occur less often at larger scales.