quince-science / QuinCe

QuinCe is an online tool for processing and quality control of data from scientific instruments, with a primary focus on oceanic data.
https://quince.science
GNU General Public License v3.0

RAM usage in ExtractDatasetJob #2964

Open squaregoldfish opened 1 month ago

squaregoldfish commented 1 month ago

On quince.science this job seems to go mad and use up loads of RAM - even more than the specified JVM limit.

squaregoldfish commented 1 month ago

The trouble is that we're loading the contents of every file but never discarding them. There's a loadContents method, but perhaps it should return a FileContents object that can be discarded once the caller is done with it. Either that or add a discardContents method, although that relies on the caller remembering to call it.
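A rough sketch of the first option, where loading returns a separate object instead of storing the data on the file. `FileContentsSketch` and `ReturningDataFileSketch` are hypothetical names for illustration, not the existing QuinCe classes, and the in-memory `source` list only stands in for reading from disk:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for one file's lines. Once the caller drops its
// reference, the garbage collector can reclaim the data.
final class FileContentsSketch {
  private final List<String> lines;

  FileContentsSketch(List<String> lines) {
    this.lines = new ArrayList<>(lines);
  }

  int lineCount() {
    return lines.size();
  }

  String line(int index) {
    return lines.get(index);
  }
}

// Hypothetical stand-in for DataFile. It keeps no reference to the
// loaded contents, so nothing accumulates as more files are processed.
class ReturningDataFileSketch {
  // Assumption: stands in for the file on disk.
  private final List<String> source;

  ReturningDataFileSketch(List<String> source) {
    this.source = source;
  }

  // Builds a fresh contents object on every call and does not cache it.
  FileContentsSketch loadContents() {
    return new FileContentsSketch(source);
  }
}
```

With this shape the contents live only as long as the caller's local variable, so there is no discard call to forget. The cost is that every existing method that reads the contents through the file object would need to take the contents object instead.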

Theory 3: Keep a static reference to the DataFile that has its contents loaded. Then, whenever loadContents is called, call a static method that unloads the contents of the previous file and sets this object as the CONTENTS_LOADED file. That way the system will automatically keep only one file's data loaded.
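A minimal sketch of Theory 3, assuming only one file ever needs its contents in memory at a time. `DataFileSketch`, `setLoaded` and `readLines` are hypothetical names, and the real DataFile will differ:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for DataFile; illustrative only.
class DataFileSketch {

  // The one file whose contents are currently held in memory.
  private static DataFileSketch contentsLoaded = null;

  private final String name;
  private List<String> contents = null;

  DataFileSketch(String name) {
    this.name = name;
  }

  // Unload the previously loaded file (if any) and register the new one.
  private static synchronized void setLoaded(DataFileSketch file) {
    if (contentsLoaded != null && contentsLoaded != file) {
      contentsLoaded.contents = null;
    }
    contentsLoaded = file;
  }

  // Load this file's contents, evicting whichever file was loaded before.
  void loadContents() {
    if (contents == null) {
      setLoaded(this);
      contents = readLines();
    }
  }

  boolean isLoaded() {
    return contents != null;
  }

  // Placeholder for the real file read.
  private List<String> readLines() {
    List<String> lines = new ArrayList<>();
    lines.add("contents of " + name);
    return lines;
  }
}
```

One caveat: the static field is shared across the whole JVM, so if two jobs process files at the same time they would evict each other's contents. A per-job or per-thread holder may be safer than a single static.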

squaregoldfish commented 1 month ago

So that wasn't the issue. Next thought: there are a huge number of unnecessary database queries made when saving the SensorValues to the database, so eliminate those.

squaregoldfish commented 4 weeks ago

There are a lot of simple calls (like String.replaceAll and my trimString) that end up using regular expressions underneath, and that creates a lot of objects. We might be able to get round that by manipulating the strings directly, although we have to be careful.
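A sketch of what direct manipulation could look like. The trimming rule here (whitespace and double quotes) is an assumption, since the real trimString rules aren't shown in this thread, and `StringSketch`, `trimSketch` and `removeChar` are hypothetical names:

```java
// Hypothetical regex-free helpers; illustrative only.
final class StringSketch {

  private StringSketch() {
  }

  // Assumed rule: strip whitespace and double quotes from both ends.
  private static boolean isTrimmable(char c) {
    return Character.isWhitespace(c) || c == '"';
  }

  // Scan inwards from both ends, then take a single substring.
  static String trimSketch(String in) {
    if (in == null) {
      return null;
    }
    int start = 0;
    int end = in.length();
    while (start < end && isTrimmable(in.charAt(start))) {
      start++;
    }
    while (end > start && isTrimmable(in.charAt(end - 1))) {
      end--;
    }
    return in.substring(start, end);
  }

  // Remove every occurrence of one character without a Pattern or Matcher.
  static String removeChar(String in, char remove) {
    if (in.indexOf(remove) < 0) {
      return in; // Nothing to remove, so no new object is created.
    }
    StringBuilder sb = new StringBuilder(in.length());
    for (int i = 0; i < in.length(); i++) {
      char c = in.charAt(i);
      if (c != remove) {
        sb.append(c);
      }
    }
    return sb.toString();
  }
}
```

The "be careful" part applies here too: `Character.isWhitespace` accepts more characters than the default regex `\s` class, so any replacement needs checking against the existing behaviour on real data files before swapping it in.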