smith-chem-wisc / MetaMorpheus

Proteomics search software with integrated calibration, PTM discovery, bottom-up, top-down and LFQ capabilities
MIT License

running on hpc environment #2250

Open rsalz opened 1 year ago

rsalz commented 1 year ago

Hi, I'm running MetaMorpheus on my cluster with task toml files. I was wondering if there is a way to add more cores (maybe an extra line in the toml file?). Task4 takes several days and only uses 4 cores when I have 120 to allocate. I'm also wondering how to add BayesianFoldChange analysis (the FlashLFQ portion) to my run. Do I add a line in my Task5 toml for that? TY

trishorts commented 1 year ago

In Windows, the "MaxThreadsToUsePerFile" parameter in the toml file controls how many cores are used. This is line 50 in the latest version. Here, I show 7, which is all my laptop will allow. [image]
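For anyone scripting this on a cluster, here is a minimal Python sketch of changing that value without opening the GUI. It only rewrites an existing `key = value` line in the task file text. The `[CommonParameters]` section name, the sample excerpt, and the value 32 are assumptions for illustration; check them against your own task toml. Whether a higher thread count actually helps on an HPC node is not confirmed in this thread.

```python
import re


def set_toml_value(toml_text: str, key: str, value: str) -> str:
    """Replace the value on an existing `key = ...` line; raise if the key is absent."""
    # Match the whole line for this key, keeping the "key = " prefix in group 1.
    pattern = re.compile(rf"^([ \t]*{re.escape(key)}[ \t]*=[ \t]*).*$", re.MULTILINE)
    new_text, count = pattern.subn(lambda m: m.group(1) + value, toml_text)
    if count == 0:
        raise KeyError(f"{key} not found in task file")
    return new_text


# Hypothetical excerpt of a task file; a real one has many more keys.
example = """[CommonParameters]
MaxThreadsToUsePerFile = 4
"""

updated = set_toml_value(example, "MaxThreadsToUsePerFile", "32")
print(updated)
```

The same helper could be pointed at other boolean or numeric keys in the task file, as long as the key already exists there.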

trishorts commented 1 year ago

Running MM on HPC may not accommodate the use of threads in the same way as on Windows. I have almost no experience w/ HPC. Some collaborators worked to adapt MM to HPC. You can check on that here: https://github.com/PSTL-UH/HPCMetaMorpheus. I don't think that it is up to date.

trishorts commented 1 year ago

Quantification with FlashLFQ is enabled by setting "DoQuantification" in the toml to true (line 18 in the latest release). [image]

trishorts commented 1 year ago

However, this doesn't do the bayesian thing. You can add an ExperimentalDesign file to the same directory as your raw files and MM will use that in the quant. It allows you to specify conditions/bioreps/fractions/techreps.

trishorts commented 1 year ago

Here is an example ExperimentalDesign.tsv file: ExperimentalDesign.zip
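If you would rather generate the file than edit it by hand, a small Python sketch along these lines should work. The column header used here (FileName, Condition, Biorep, Fraction, Techrep) is an assumption based on the fields mentioned above, and the file names and conditions are made up; compare against the attached example before relying on it, including whether file names should carry an extension.

```python
import csv
import io

# Assumed header; verify against the example ExperimentalDesign.tsv.
COLUMNS = ["FileName", "Condition", "Biorep", "Fraction", "Techrep"]


def experimental_design_tsv(rows):
    """Render a list of row dicts as tab-separated text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    for row in rows:
        writer.writerow([row[c] for c in COLUMNS])
    return buf.getvalue()


# Hypothetical design: two conditions, two biological replicates each.
rows = [
    {"FileName": "control_rep1", "Condition": "control", "Biorep": 1, "Fraction": 1, "Techrep": 1},
    {"FileName": "control_rep2", "Condition": "control", "Biorep": 2, "Fraction": 1, "Techrep": 1},
    {"FileName": "treated_rep1", "Condition": "treated", "Biorep": 1, "Fraction": 1, "Techrep": 1},
    {"FileName": "treated_rep2", "Condition": "treated", "Biorep": 2, "Fraction": 1, "Techrep": 1},
]

tsv_text = experimental_design_tsv(rows)
print(tsv_text)
```

Writing `tsv_text` to `ExperimentalDesign.tsv` in the directory that holds the raw files would then match the setup described above.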

rsalz commented 1 year ago

Thanks for the info. I supplied the ExperimentalDesign file and it was read in by MetaMorpheus. Does that mean I will get the Bayesian fold change file automatically, or is there no way to obtain it without running FlashLFQ separately?

In the case I do have to run FlashLFQ separately after metamorpheus finishes: I set Normalize to "true" in my metamorpheus Task5 toml. Should I still set normalization (with --nor argument) in the FlashLFQ thereafter or leave it out?

trishorts commented 1 year ago

MM will do FlashLFQ and provide quantitative protein values. But it will NOT do the bayesian thing. For that, you have to run FlashLFQ separately. I made an issue to add the bayesian thing to MM. I don't think it will be hard, but it is on a list, so it may take a few weeks to complete. I'm not sure how well normalization works, so I would leave it off for now. You can see how it works in FlashLFQ, which has normalization capabilities. I don't think you need normalized data for bayesian anyway. I think that it takes care of what it needs.
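As a rough sketch of what the separate FlashLFQ step could look like from a script, the Python below only assembles an argument list and does not run anything. Only `--nor` appears in this thread; `--idt`, `--rep`, `--bay` and `--ctr` are my assumptions about the FlashLFQ command line, and the executable and paths are placeholders, so check all of them against the FlashLFQ help output before use.

```python
def flashlfq_command(executable, psm_file, spectra_dir, control_condition, normalize=False):
    """Build a FlashLFQ argument list for a Bayesian fold-change run (flags are assumptions)."""
    cmd = list(executable) + [
        "--idt", psm_file,           # assumed flag: identification (PSM) file
        "--rep", spectra_dir,        # assumed flag: directory with the spectra files
        "--bay", "true",             # assumed flag: Bayesian fold-change analysis
        "--ctr", control_condition,  # assumed flag: control condition name
    ]
    if normalize:
        # The thread suggests leaving normalization off for now.
        cmd += ["--nor", "true"]
    return cmd


# Placeholder executable and paths; adjust to your own installation and output.
cmd = flashlfq_command(
    executable=["dotnet", "CMD.dll"],
    psm_file="Task5/AllPSMs.psmtsv",
    spectra_dir="raw_files",
    control_condition="control",
)
print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` once the flags have been verified.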