dataChest implementation in measurement code is inefficient

roboguy222 commented 8 years ago

The way dataChest is used in the measurement code needs to change. Data saving is inefficient: thousands of calls are made to addData(), a method that can handle all of the data at once. Eliminating that per-call overhead alone would speed things up quite a bit, if I had to guess. The code also creates multiple hdf5 datasets for a single scan, whereas only one .txt and one .mat file are created for the same run. I realize that this is probably due to the broken grapher. Let's take a look at this together sometime this week, @patzinak.
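To make the point about call count concrete, here is a minimal sketch of the two saving patterns. StandInChest is a hypothetical stand-in that only counts calls; it is not the real dataChest class, and its addData() signature is an assumption made for illustration.

```python
class StandInChest:
    """Hypothetical stand-in for a data-saving object (NOT the real dataChest API)."""

    def __init__(self):
        self.rows = []   # everything "saved" so far
        self.calls = 0   # how many times addData() was invoked

    def addData(self, rows):
        # In a real backend every call would pay a fixed cost
        # (argument checks, file access), whatever the batch size.
        self.calls += 1
        self.rows.extend(rows)


def save_row_by_row(chest, scan):
    """Pattern described in the issue: one addData() call per row."""
    for row in scan:
        chest.addData([row])


def save_batched(chest, scan):
    """Alternative: hand the whole scan to addData() in a single call."""
    chest.addData(list(scan))


# Made-up scan: 1000 rows of [independent value, dependent value].
scan = [[float(i), float(i) ** 2] for i in range(1000)]

slow = StandInChest()
save_row_by_row(slow, scan)

fast = StandInChest()
save_batched(fast, scan)

print(slow.calls, fast.calls)  # 1000 1
```

Both patterns store identical rows; only the number of calls differs, which is where any fixed per-call cost would add up.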

roboguy222 commented 8 years ago

Fuck, this was sent from @amopremcak, I didn't see Chris had logged in.

patzinak commented 8 years ago

Yes, it would be great to go over this together.

A couple of points, though.

Improving the efficiency here should have lower priority than debugging and improving the grapher. Yes, we are losing a couple of seconds here and there, but we are losing more time in many other places. In particular, the measurement code is not structured efficiently, the servers could be optimized, and so on and so forth.
I could have guessed that making multiple addData() calls is not the best idea, but as far as I can recall, the manual did not explicitly say to avoid it. The fact that you can save everything at once does not mean that you should, and conversely, the fact that you can save things incrementally does not mean that you should. If I remember correctly, this was the easiest way to save data. We can revise this.
Could you show me the script that saves multiple hdf5 files? This is not intended behavior. I ran a script on the Leiden machine and saw only one file created. If this is indeed a bug, I'll fix it.
This issue should have been opened in the measurement repository, not in the servers repository, and it should be closed once the dataChest manual is updated to reflect the optimal approach to data saving.
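One quick way to check the multiple-file report for a given run is to count the files in the output directory by extension. This is only a sketch: the helper name count_by_extension and the file names below are made up for illustration, and the demo uses a temporary directory in place of a real data folder.

```python
import tempfile
from collections import Counter
from pathlib import Path


def count_by_extension(directory):
    """Count the files directly inside `directory`, grouped by lowercase extension."""
    return Counter(
        p.suffix.lower() for p in Path(directory).iterdir() if p.is_file()
    )


# Demo on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("scan_a.hdf5", "scan_b.hdf5", "scan.txt", "scan.mat"):
        (Path(tmp) / name).touch()
    counts = count_by_extension(tmp)

print(dict(counts))
```

More than one .hdf5 entry next to a single .txt and .mat file for the same scan would match the behavior described in the report.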

patzinak commented 7 years ago

This has been addressed in this commit. There is now only a negligible difference in saving time between the different formats.

McDermott-Group / servers

dataChest implementation in measurement code is inefficient #88