Closed dbdorman closed 5 years ago
Thank you for catching and reporting it. It is a regression bug caused by #352.
PR #361 should fix it. The nightly package should be available in a couple of hours. I have tested the new changes with `watch -n 0.001 stat outfile.csv` and it seems to be working fine.
$ pip3 install pymoose --user --upgrade --pre
Alternatively, you can `git pull` and build it yourself.
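For anyone who prefers to check from Python rather than with `watch` and `stat`, here is a minimal sketch that polls the size of the output file. The helper name `watch_size` and its parameters are made up for illustration; only `os.stat` is real API.

```python
import os
import time


def watch_size(path, interval=0.5, samples=5):
    """Poll the size of `path` and return the list of observed sizes.

    A size of None means the file did not exist at that moment.
    """
    sizes = []
    for _ in range(samples):
        try:
            sizes.append(os.stat(path).st_size)
        except FileNotFoundError:
            sizes.append(None)
        time.sleep(interval)
    return sizes


if __name__ == "__main__":
    # Run this in a second terminal while the simulation is running.
    # "outfile.csv" is the file name used in the comment above.
    print(watch_size("outfile.csv"))
```

If the sizes grow between samples, the streamer is flushing to disk while the simulation runs.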
On large data sets, writing to `.npy` may cause data corruption. It is very rare, but it has happened to me a couple of times. I was not able to pinpoint the cause. The plain `csv` format is recommended.
Thanks for the quick fix. I also had an issue with a corrupted .npy file, which I thought might have been caused by writing too many columns, because I didn’t see it once I reduced the number of tables in the streamer. I’ll follow your advice and use csv.
@dbdorman Thanks for the pointer about the number of columns and data corruption in the npy format. The length of the header may be computed incorrectly (https://github.com/numpy/numpy/blob/067cb067cb17a20422e51da908920a4fbb3ab851/doc/neps/nep-0001-npy-format.rst). I'll take another shot at it.
@dilawar Thanks. I think my header must have exceeded the length allowed by the npy 1.0 format; the npy 2.0 format allows a longer header. See this section from the link you mentioned. However, it looks like Moose is only using the npy 1.0 format, in this line
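To make the header-length limit concrete, here is a rough sketch of the npy preamble layout as described in the linked NEP: version 1.0 stores the header length as a 2-byte little-endian unsigned integer (at most 65535 bytes), while version 2.0 uses 4 bytes. The function names and the column paths are hypothetical and this is not Moose's writer; it only shows why a header with many columns can overflow the 1.0 field.

```python
import struct

MAGIC = b"\x93NUMPY"


def build_npy_preamble(header_text, version=(1, 0)):
    """Return magic + version + HEADER_LEN + padded header bytes."""
    # v1.0: HEADER_LEN is uint16 ('<H'); v2.0: HEADER_LEN is uint32 ('<I').
    len_fmt = "<H" if version == (1, 0) else "<I"
    len_size = struct.calcsize(len_fmt)
    header = header_text.encode("latin1")
    # Pad with spaces and end with a newline so the whole preamble is a
    # multiple of 64 bytes (which also satisfies the 16-byte alignment
    # in the original spec).
    unpadded = len(MAGIC) + 2 + len_size + len(header) + 1
    header += b" " * ((-unpadded) % 64) + b"\n"
    max_len = 2 ** (8 * len_size) - 1
    if len(header) > max_len:
        raise ValueError(
            f"header is {len(header)} bytes, limit for npy "
            f"{version[0]}.{version[1]} is {max_len}"
        )
    return MAGIC + bytes(version) + struct.pack(len_fmt, len(header)) + header


def example_header(n_columns):
    """Header dict text for a structured array (made-up column names)."""
    fields = ", ".join(
        f"('/model/table{i}/vm', '<f8')" for i in range(n_columns)
    )
    return f"{{'descr': [{fields}], 'fortran_order': False, 'shape': (0,), }}"


if __name__ == "__main__":
    build_npy_preamble(example_header(10))            # fits in npy 1.0
    build_npy_preamble(example_header(5000), (2, 0))  # needs npy 2.0
    try:
        build_npy_preamble(example_header(5000))      # too long for npy 1.0
    except ValueError as err:
        print(err)
```

A writer that packs the length into 2 bytes without this check would either fail or store a wrong length, and readers would then misparse the file.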
@dbdorman After #395 is merged, you should be able to use the `npy` format with `moose.Streamer`. The streamer ticks every 10 seconds of simulation time. At the end of the simulation, it appends the leftover data to the file.
If you face any issue, let me know.
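The behaviour described above (buffer in memory, flush on a fixed simulation-time interval, append whatever is left at the end) can be sketched in plain Python as follows. The class `BufferedCsvStreamer` is a hypothetical stand-in for illustration only and is not how `moose.Streamer` is implemented.

```python
import csv


class BufferedCsvStreamer:
    """Toy CSV streamer: header up front, periodic flush, final flush."""

    def __init__(self, path, columns, flush_interval=10.0):
        self.path = path
        self.flush_interval = flush_interval  # in simulation seconds
        self._rows = []
        self._last_flush = 0.0
        # Write the header immediately (the thread reports the real
        # streamer writing its header on reinit).
        with open(path, "w", newline="") as f:
            csv.writer(f).writerow(["time", *columns])

    def record(self, t, values):
        """Buffer one row; flush once `flush_interval` has elapsed."""
        self._rows.append([t, *values])
        if t - self._last_flush >= self.flush_interval:
            self.flush()
            self._last_flush = t

    def flush(self):
        """Append buffered rows to the file and clear the buffer."""
        if not self._rows:
            return
        with open(self.path, "a", newline="") as f:
            csv.writer(f).writerows(self._rows)
        self._rows.clear()

    def close(self):
        """Append the leftover rows at the end of the simulation."""
        self.flush()
```

With `flush_interval=10.0`, a run of 25 simulated seconds flushes at t=10 and t=20, and `close()` appends the remaining rows.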
`moose.Streamer` writes the header, but nothing else, to disk while `moose.start(simtime)` is running. It does not flush to disk until `moose.start(simtime)` completes. I tested it with `moose-core/tests/python/test_streamer.py` but changed the simulation time to 5700 seconds so I could monitor whether the file was being written to during the simulation. No data was written (except the header, written on `moose.reinit`) until the Python command `moose.start(simtime)` completed. I tested with both `.npy` and `.csv` formats and had the same issue with both.
Is this the expected behavior? As a workaround I can call `moose.start` repeatedly at shorter intervals to flush to disk.
My version of Moose is the current moose-core git master branch as of earlier today, compiled on Fedora, using Python3.
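A minimal sketch of the workaround mentioned above: split the total simulation time into shorter runs so that each return from `moose.start` gives the streamer a chance to flush. The helper `run_in_chunks` is hypothetical, and it assumes (as reported here) that data reaches disk whenever `moose.start` completes. It takes the start function as an argument so it can be tried without Moose installed.

```python
def run_in_chunks(start, simtime, chunk):
    """Call start(dt) repeatedly until `simtime` seconds have been run.

    `start` is any callable taking a duration, e.g. moose.start.
    Returns the list of durations passed to `start`.
    """
    if chunk <= 0:
        raise ValueError("chunk must be positive")
    steps = []
    remaining = float(simtime)
    while remaining > 1e-12:
        dt = min(chunk, remaining)
        start(dt)
        steps.append(dt)
        remaining -= dt
    return steps


# Intended usage (requires Moose; not run here):
#   import moose
#   moose.reinit()
#   run_in_chunks(moose.start, 5700, 10)
```

A smaller `chunk` flushes more often at the cost of more calls into the simulator.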