mzmine / mzmine

mzmine source code repository
https://mzmine.github.io/mzmine_documentation/
MIT License

Using HPC / cluster to use mzmine? #2023

Open marwa38 opened 2 months ago

marwa38 commented 2 months ago

Hi team, I wonder if you have any advice for the IT team, as I will ask them to allow me to use mzmine on the university's HPC. It is SLURM-based here. I will just share your GitHub repository and ask them to install mzmine, as they are the ones who handle installations. Thanks, Marwa

SteffenHeu commented 2 months ago

Hi marwa,

we cannot help you with the specifics of your server, but there are some points to consider:

ansgarkorf commented 2 months ago

Hi Marwa, did it work? Can we close this?

marwa38 commented 2 months ago

I will let you know before the start of next week, hopefully. I requested access to the cluster and am figuring it out.

Sanchezillana commented 2 months ago

I’m also exploring this. I aim to process over 500 DDA samples, including a library search, using mzmine. Currently, I’m working on a local setup with my PC (i7-13700, 64GB RAM) via the command line, utilizing a .mzbatch file generated through the wizard (which I assume is equivalent to .xml).

I’m curious whether transitioning to the HPC environment will significantly boost performance, considering the differences in hardware. My PC has 24 cores at 4 GHz with 64 GB RAM, whereas the HPC node I’ll be using has 32 cores at 3 GHz and 192 GB RAM. However, since the single-core speed of the HPC is slower, I'm wondering if the performance gains will offset that.

From what I understand, in this version of mzmine we’re limited to using a single node on the cluster and cannot set up MPI jobs across multiple nodes. Additionally, I anticipate some challenges in getting everything to run smoothly on the CentOS-based HPC setup. I’ll keep you updated on how it goes.
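
For context, the kind of single-node SLURM script I have in mind is sketched below. All paths and resource numbers are placeholders, and the -batch and --temp flags are the ones described in the mzmine command-line documentation; verify them against your installed version:

    #!/bin/bash
    #SBATCH --job-name=mzmine-batch
    #SBATCH --nodes=1                # mzmine parallelizes with threads, not MPI
    #SBATCH --cpus-per-task=32
    #SBATCH --mem=180G
    #SBATCH --time=24:00:00

    # Placeholder paths; adjust to your installation and data layout.
    MZMINE_DIR=/path/to/mzmine
    "$MZMINE_DIR/bin/mzmine" -batch /path/to/workflow.mzbatch --temp "$TMPDIR"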

Sanchezillana commented 1 month ago

So far, I’ve successfully launched jobs via SLURM on the login node, which is interactive and has X11 enabled. This was just for testing purposes, as the login node is quite limited and it doesn't make sense to use HPC resources this way. However, when I attempt to run jobs on the compute nodes, I encounter the following error:

Exception in thread "main" java.lang.UnsatisfiedLinkError: /storage/scratch/lv82/lv82834/mzmine_4_2_0/lib/runtime/lib/libawt_xawt.so: libXtst.so.6: cannot open shared object file: No such file or directory
    at java.base/jdk.internal.loader.NativeLibraries.load(Native Method)
    at java.base/jdk.internal.loader.NativeLibraries$NativeLibraryImpl.open(NativeLibraries.java:331)
    at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:197)
    at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:139)
    at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2399)
    at java.base/java.lang.Runtime.load0(Runtime.java:852)
    at java.base/java.lang.System.load(System.java:2030)
    at java.base/jdk.internal.loader.NativeLibraries.load(Native Method)
    at java.base/jdk.internal.loader.NativeLibraries$NativeLibraryImpl.open(NativeLibraries.java:331)
    at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:197)
    at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:139)
    at java.base/jdk.internal.loader.NativeLibraries.findFromPaths(NativeLibraries.java:259)
    at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:249)
    at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2408)
    at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:916)
    at java.base/java.lang.System.loadLibrary(System.java:2068)
    at java.desktop/java.awt.Toolkit$2.run(Toolkit.java:1384)
    at java.desktop/java.awt.Toolkit$2.run(Toolkit.java:1382)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:319)
    at java.desktop/java.awt.Toolkit.loadLibraries(Toolkit.java:1381)
    at java.desktop/java.awt.Toolkit.initStatic(Toolkit.java:1419)
    at java.desktop/java.awt.Toolkit.<clinit>(Toolkit.java:1393)
    at java.desktop/java.awt.Font.<clinit>(Font.java:288)
    at org.jfree.chart.axis.Axis.<clinit>(Axis.java:97)
    at org.jfree.chart.StandardChartTheme.<init>(StandardChartTheme.java:211)
    at org.jfree.chart.StandardChartTheme.<init>(StandardChartTheme.java:293)
    at io.github.mzmine.gui.chartbasics.chartthemes.EStandardChartTheme.<init>(EStandardChartTheme.java:108)
    at io.github.mzmine.main.impl.MZmineConfigurationImpl.<init>(MZmineConfigurationImpl.java:97)
    at io.github.mzmine.main.ConfigService.<clinit>(ConfigService.java:41)
    at io.github.mzmine.main.MZmineCore.main(MZmineCore.java:155)

It seems to be related to missing X11 libraries (specifically libXtst.so.6). I’m planning to request the installation of these libraries on the compute nodes, but I’m uncertain if this will fully resolve the issue or if there are better alternatives.

Has anyone faced similar issues or have advice on running mzmine in an HPC environment without X11 dependencies? Any insights would be greatly appreciated!
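
In case it helps others, two generic checks I'm considering, sketched under the assumption of a standard Linux setup: ldd lists which native dependencies of the bundled runtime are missing on a compute node, and Java's headless mode makes the JVM load libawt_headless instead of libawt_xawt, avoiding the X11 libraries entirely (whether mzmine's batch mode runs fully headless depends on the build):

    # List native libraries the bundled AWT needs but the node lacks
    ldd /storage/scratch/lv82/lv82834/mzmine_4_2_0/lib/runtime/lib/libawt_xawt.so | grep "not found"

    # Ask the JVM for headless AWT, bypassing the X11-backed toolkit
    export JAVA_TOOL_OPTIONS="-Djava.awt.headless=true"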

Sanchezillana commented 1 month ago

I should also mention that my local PC crashes during batch processing of 500 samples, specifically at the 'Local Min Feature Resolver' step (around 50% progress). The error I receive is:

    java.lang.OutOfMemoryError: Java heap space

I’ve tried several approaches to resolve this, including increasing virtual memory, reducing the number of processing threads, and using Process Lasso to pin affinity to the P-cores. Unfortunately, none of these solutions have worked so far (see the attached batch file, the same one I'm testing on the cluster). batch.zip
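
One more generic JVM knob I may try, assuming mzmine uses the standard jpackage launcher layout (the lib/app/mzmine.cfg location is an assumption; check your installation): the maximum heap can be raised by adding an -Xmx entry to the launcher configuration, for example:

    [JavaOptions]
    java-options=-Xmx48g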

Sanchezillana commented 1 month ago

After requesting the installation of the necessary libXtst.so.6 library, I was able to run the program successfully on a high-memory node (1.5 TB of RAM). During batch processing (without library search), mzmine consumed around 300 GB of memory, which seems quite excessive to me for this type of task.

The HPC support team suggested that this might be indicative of a memory leak within the program. Given this significant memory usage, I'm wondering if anyone else has encountered similar issues or has suggestions for optimizing memory consumption in mzmine.

For now, I’ll continue using the HPC cluster, but any insights or advice on how to handle this memory usage would be greatly appreciated.

ansgarkorf commented 1 month ago

Hi, thanks for providing all the important information. It indeed sounds like a memory leak. Are you able to track when the RAM usage goes up? Maybe you can determine which module is responsible for the heavy usage.

Can you also check in the settings under general "Keep in memory"? It is possible to set it to ALL. Make sure to set it to NONE.

[screenshot of the "Keep in memory" preference]

Sanchezillana commented 1 month ago

Yeah, sure, it is set to NONE. When I run in CLI mode, I think the default is also NONE. The memory craziness is always during the 'Local Min Feature Resolver', but it increases progressively during the run.

Sanchezillana commented 1 month ago

Ok, I finally managed to process all 500 DDA samples on a high-memory node (72 cores @ 3GHz, 1.5 TB of RAM). At its peak, mzmine utilized around 1.1 TB of RAM, which is a substantial portion of the node’s capacity. The entire job took approximately 6 hours to complete. Given the complexity and scale of the job, I’m starting to consider that such high memory usage might be expected... However, I'm still curious if there are any known optimizations or configurations that could help reduce memory consumption or improve overall performance. I’m now planning to optimize the processing workflow, starting from this recent successful run. My next steps include conducting library searches and further fine-tuning the processing parameters. I'll keep the community updated on any progress or findings.

2024-09-08 23:17:58 INFO io.github.mzmine.modules.batchmode.BatchTask run Finished a batch of 17 steps
2024-09-08 23:17:58 INFO io.github.mzmine.modules.batchmode.BatchTask printBatchTimes Timing: Whole batch took PT5H59M19.589183695S to finish
Step 1: Import MS data took PT57.815294795S to finish
Step 2: Chromatogram builder took PT1M36.598184368S to finish
Step 3: Smoothing took PT1M10.540471673S to finish
Step 4: Local minimum feature resolver took PT2M40.424936179S to finish
Step 5: 13C isotope filter (formerly: isotope grouper) took PT40.448704232S to finish
Step 6: Isotopic peaks finder took PT9.847988839S to finish
Step 7: Join aligner took PT4M46.719240044S to finish
Step 8: Feature list rows filter took PT16M58.263264139S to finish
Step 9: Peak finder (multithreaded) took PT29M7.158535494S to finish
Step 10: Duplicate peak filter took PT2H38M2.077070383S to finish
Step 11: Correlation grouping (metaCorrelate) took PT2H9M41.953462506S to finish
Step 12: Ion identity networking took PT2.125329372S to finish
Step 13: Spectral / Molecular Networking took PT1M27.219474913S to finish
Step 14: Export spectral networks to graphml (FBMN/IIMN) took PT3M2.036248481S to finish
Step 15: Export molecular networking files (e.g., GNPS, FBMN, IIMN, MetGem) took PT4M26.674665816S to finish
Step 16: Export for SIRIUS took PT4M29.417553734S to finish
Step 17: Export all annotations to CSV file took PT0.264971242S to finish

ansgarkorf commented 1 month ago

Thanks for all the details. We found something. Can you please try processing again with the latest development release, version 4.2.8?

Sanchezillana commented 1 month ago

Hi @ansgarkorf. I built mzmine 4.2.8 with Gradle in IntelliJ on my Windows PC and ran the batch, and I still have the memory issue when processing the 500 DDA samples.

Sanchezillana commented 1 month ago

Some feedback @ansgarkorf:

The admins have installed JDK 21 on the HPC, and I ran ./gradlew to build the sources. I got:

> Task :help
Welcome to Gradle 8.10.
To run a build, run gradlew <task> ...
To see a list of available tasks, run gradlew tasks
To see more detail about a task, run gradlew help --task <task>
To see a list of command-line options, run gradlew --help
For more detail on using Gradle, see https://docs.gradle.org/8.10/userguide/command_line_interface.html
For troubleshooting, visit https://help.gradle.org/

Deprecated Gradle features were used in this build, making it incompatible with Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.

For more on this, please refer to https://docs.gradle.org/8.10/userguide/command_line_interface.html#sec:command_line_warnings in the Gradle documentation.

BUILD SUCCESSFUL in 1m 30s
11 actionable tasks: 11 executed

But there is no build/jpackage folder. Also, users cannot write to /. Do you know where the package is saved? Can I change this directory?
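
Rereading the output above, a possible explanation: a bare ./gradlew run only executes the :help task (visible as "Task :help" in the log), so perhaps nothing was packaged at all. If so, the fix would be to name the packaging task explicitly; Gradle writes its output under the project's own build/ directory, not /. Something like (the task name is a guess; I'd pick it from the task listing):

    # List the tasks the project defines
    ./gradlew tasks

    # Run the packaging task explicitly (name is illustrative; use one from
    # the `tasks` output). Output lands under <project>/build/, not /.
    ./gradlew jpackage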

ansgarkorf commented 1 month ago

Hi, we will provide ready-to-use builds for the recent development versions as well, and we are still looking into the potential memory leak.

Sanchezillana commented 1 month ago

Thank you so much!

robinschmid commented 1 month ago

Thanks for the detailed report.

Performance tuning options are described here: https://mzmine.github.io/mzmine_documentation/performance.html

Can you share a representative sample of your 500 batch and the latest batch file?

  1. For a large study I always suggest being a bit stricter with the filtering early on. Increase the noise levels and the minimum height, and require that a feature be detected in 10-20% of your samples. For clinical studies maybe even more than that, depending on whether unique features in individual samples are of interest or whether it is more about solid statistics.
  2. If the data was acquired in multiple batches, import a few samples from each batch and check for retention time shifts. These will make the alignment harder, and the number of features will explode.
  3. Even with higher noise thresholds, gap filling will detect lower-intensity features as long as a feature was high enough in those X% of samples.

Profiling a Java application can be hard. You can check out VisualVM or the profilers in IntelliJ and compare memory dumps / live monitors. If you provide 1 TB of RAM, mzmine will most likely just slowly fill it all up and only perform garbage collection if really needed. This is the highest-throughput method and should get you the results fastest. We could check and maybe add an option to run GC after key steps to keep the maximum memory consumption more in check.

In the GUI you can click on the bottom right RAM monitor and it will perform garbage collection. Or in VisualVM you can also trigger GC and see that mzmine does not use all of the memory you think it is. The JVM just holds on to the memory and manages it.
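
For a headless setup, the same manual GC trigger is available through standard JDK tooling; this is generic JVM tooling (jps and jcmd ship with any full JDK), not an mzmine feature:

    # Find the PID of the running mzmine JVM
    jps -l

    # Force a full garbage collection in that JVM
    jcmd <PID> GC.run

    # Print a summary of current heap usage
    jcmd <PID> GC.heap_info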

I am interested to hear if you know of any other options.

robinschmid commented 1 month ago

Also on the server, did you check that the temp directory is accessible and that mzmine is writing temp files there? With so much memory it could also be faster to just use the RAM and avoid memory mapping, but that's up to your preference.

ansgarkorf commented 1 month ago

Development builds are now available for download. https://github.com/mzmine/mzmine/releases

Sanchezillana commented 1 month ago

Thank you @robinschmid for your feedback.

Can you share a representative sample of your 500 batch and the latest batch file?

I have sent you a pooled QC sample via email. We are currently working on this project, and it is quite confidential at the moment. This is the most recent batch I launched on the HPC, which is currently queued. biomoval_1_vives.zip

  1. For a large study I always suggest being a bit stricter with the filtering early on. Increase the noise levels and the minimum height, and require that a feature be detected in 10-20% of your samples. For clinical studies maybe even more than that, depending on whether unique features in individual samples are of interest or whether it is more about solid statistics.

I am currently using a low signal factor for mass detection in the advanced import. However, I agree that using a factor of 5 for MS1 might be too low. I will optimize this and evaluate the results. It might even work on my desktop PC (64 GB RAM, as mentioned earlier). Right now, it crashes during the chromatogram deconvolution step with a java.lang.OutOfMemoryError: Java heap space error.

  2. If the data was acquired in multiple batches, import a few samples from each batch and check for retention time shifts. These will make the alignment harder, and the number of features will explode.

The data was acquired in a single batch of 500 samples. While it is a large batch, we haven’t noticed significant batch effects so far, although we plan to assess this as part of the project.

  3. Even with higher noise thresholds, gap filling will detect lower-intensity features as long as a feature was high enough in those X% of samples.

Understood. I will experiment with higher noise thresholds in both mass and feature detection.

Profiling a Java application can be hard. You can check out VisualVM or the profilers in IntelliJ and compare memory dumps / live monitors. If you provide 1 TB of RAM, mzmine will most likely just slowly fill it all up and only perform garbage collection if really needed. This is the highest-throughput method and should get you the results fastest. We could check and maybe add an option to run GC after key steps to keep the maximum memory consumption more in check. In the GUI you can click on the bottom right RAM monitor and it will perform garbage collection. Or in VisualVM you can also trigger GC and see that mzmine does not use all of the memory you think it is. The JVM just holds on to the memory and manages it.

I see. GC might indeed be a key factor here. Having an option to trigger GC from the command line would be useful; currently it can only be done manually via the GUI, right?

Also on the server, did you check that the temp directory is accessible and that mzmine is writing temp files there? With so much memory it could also be faster to just use the RAM and avoid memory mapping, but that's up to your preference.

Yes, each node has a local HDD configured as temp storage (which is deleted after the job finishes), and I’ve set this directory using the --temp option. I initially enabled memory mapping because I suspected memory limitations (the full batch crashed on a node with a 300 GB memory limit).

robinschmid commented 1 month ago

You can hook up tools like VisualVM and others to a running Java application and trigger GC. I think a better option would be to add GC after critical steps that produce a lot of garbage.

Usually I would also leave memory mapping enabled. Just an option to try if memory is no issue at all.

robinschmid commented 1 month ago

I looked at your batch, and your chromatogram builder and local minimum resolver settings are quite open to noise. Which is OK... but this may explain the high RAM requirements.

In both you could increase the number of scans / data points from 4 to 6 or 7, considering that your chromatographic peak shapes are quite nice.

You may also want to increase the minimum height to 1E5, or at least a little, to reduce the number of noisy features.

[screenshot: chromatogram builder settings]

In the local min resolver, increase the chromatographic threshold to 85%, the number of data points to 6, the height to 1E5, and maybe the minimum search range to 0.04, as your peaks are quite narrow. You can also look into increasing the top/edge ratio to 2 or higher to improve the ratio of good peaks to noise.

[screenshot: local minimum feature resolver settings]

In the feature list rows filter, increase the minimum number of samples by percentage; this will scale with the size of your sample set. You can also set it to process in place, which is again a little bit faster.

[screenshot: feature list rows filter settings]

robinschmid commented 1 month ago

Even with the increased parameters I get 15,000 features in a single sample, so this dataset will be quite complex to process and align. The good thing is that you can increase the minimum height and other constraints, and a feature then only needs to be detected in X% of the samples to still be considered. The gap filling will then fill the holes produced by the high noise thresholds.

robinschmid commented 1 month ago

You can find the updated batch here, but it is better to check it out yourself in mzmine and try to find a good balance. biomoval_1_vives.zip

robinschmid commented 1 month ago

In your data file, this is the scan with the highest noise amplitude, and a factor of 5 already removes most of the noise... but your minimum feature height was at 5E4, which is directly in the noise region:

[screenshot: mass spectrum showing the noise region]

Sanchezillana commented 1 month ago

Thank you so much @robinschmid! I'll try the entire batch on the three systems (my desktop, the HPC node with 300 GB RAM, and the HPC node with 1.5 TB RAM).

robinschmid commented 1 month ago

I removed some steps, so better update your own batch or at least add the remaining steps back. I hope this improves the performance; otherwise you can increase some of the thresholds a bit more.

robinschmid commented 1 month ago

@Sanchezillana we updated a few things and tweaked the memory options. Memory management was set to best performance and now we are using more of a middle ground. So the GC should also reduce the used memory during the processing if enough space is free after GC.

We also added -login-console for a pure console login.

Check out the latest test build: https://github.com/mzmine/mzmine/releases/tag/Development-release

We also fixed the build for Linux. Previously the installer used the wrong package name; now it should be mzmine. We updated the readme and documentation. It would be a great help if you could check it out and help optimize it if you have any insights: https://github.com/mzmine/mzmine?tab=readme-ov-file#installation-on-linux

https://mzmine.github.io/mzmine_documentation/getting_started.html#installation-on-linux

https://mzmine.github.io/mzmine_documentation/commandline_tool.html#linux

Sanchezillana commented 1 month ago

Thank you so much, @robinschmid and mzio team! This is fantastic news. I will test out the latest build and the updated documentation over the next week and provide feedback. Your efforts are greatly appreciated!

robinschmid commented 1 month ago

Would be great to know how the new version compares against the old version. I guess you ran the old version with the adjusted batch file?

Sanchezillana commented 1 month ago

Not yet. Right now I am preparing classes and refreshing the Nernst equation and advanced electrochemistry, lol. Next week I can run the updated batch with both versions and compare the performance on my PC (if it doesn't crash) and on the HPC.

Sanchezillana commented 1 month ago

Hi @robinschmid !

I've tested the batch with all the files using the latest development build (4.3.10), and I'm still encountering the same memory issue on my desktop PC: it crashes due to memory limitations.

Regarding the HPC, I'm unable to compile the software as described in the documentation. I think the issue is that, due to admin policies, users cannot write to the root directory (/). We are only permitted to write in specific folders within /home and /storage/scratch. Is there a way to redirect the build/jpackage output to a location within these writable areas? In any case, if I understood correctly, using precompiled Java software on the HPC should not significantly impact performance, since Java's runtime optimizations handle most of it. That said, for scientific computing, compiling natively for the HPC's CPU architecture can improve performance, particularly by leveraging parallelization and vectorization, though this is more relevant for languages like C or Fortran.

For the precompiled versions, we're also encountering the libXtst.so.6 library issue. I resolved this by manually copying the attached file (library_hpc_issue.zip) into /lib/runtime/lib. After doing this, the batch file executed successfully. However, the memory usage remains as high as before, necessitating the use of a high-memory node (1.5 TB of RAM) to process my 500 samples.
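
As an aside, an alternative I considered to copying files into the runtime directory, assuming the missing library is already available somewhere on the node (the path below is a placeholder): point the dynamic loader at it without modifying the installation:

    # Make libXtst.so.6 visible to the bundled JRE without copying it
    export LD_LIBRARY_PATH=/path/to/x11-libs:$LD_LIBRARY_PATH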

Sanchezillana commented 1 month ago

Hi @robinschmid and team again!

I ran the batch you optimized, including the additional steps, on the HPC I mentioned earlier (72 cores @ 3GHz, 1.5 TB RAM) and included some commands in the scheduler script to monitor memory consumption using jstat every 10 seconds. Specifically, I tracked the following metrics:

S0, S1, E: the "young generation" memory pools, where new objects are created and promoted if they survive garbage collection.
O: the old generation, which stores long-lived objects.
M (Metaspace): holds class-related metadata.
YGC, YGCT, FGC, FGCT, GCT: garbage collection metrics tracking the activity and time spent in garbage collection.

Additionally, I used ps to extract the %MEM and VSZ of the mzmine process.
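
For anyone wanting to reproduce this, a minimal sketch of the monitoring block from my SLURM script (the mzmine invocation and file names here are illustrative; jstat and ps both sample every 10 s):

    # Launch mzmine in the background and remember its PID
    "$MZMINE_DIR/bin/mzmine" -batch workflow.mzbatch --temp "$TMPDIR" &
    MZ_PID=$!

    # Sample JVM generation sizes and GC counters every 10 s (10000 ms)
    jstat -gc "$MZ_PID" 10000 > ram_usage.log &

    # Track overall process memory (%MEM, VSZ) alongside
    while kill -0 "$MZ_PID" 2>/dev/null; do
        ps -o %mem=,vsz= -p "$MZ_PID" >> ps_mem.log
        sleep 10
    done
    wait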

I’ve attached the MZmine log and the memory log corresponding to this batch. Could you please take a look and share your thoughts on this? At the end, it used more than 830 GB of RAM.

Sanchezillana commented 1 month ago

Please see the updated files. I have relaunched the job with some minor adjustments to the SLURM script to provide clearer headers in the memory log. Let me know what you think!

ram_usage_2760628.log omp_2760628.log