Nesvilab / FragPipe

A cross-platform proteomics data analysis suite
http://fragpipe.nesvilab.org

Fragpipe and HPC batch system #560

Closed bmoa closed 2 years ago

bmoa commented 2 years ago

First, thank you all for making fragpipe available to the community. We rely on batch systems on our HPC clusters, so it is hard for our users to run fragpipe through interactive jobs every time. Since this is not optimal, I was wondering whether a user could set up the pipeline in fragpipe and generate a batch script (say, in bash) that contains all the commands fragpipe will run. Basically, I am thinking about this:

1) The user fills in all the needed fields and databases/datasets via the fragpipe GUI.
2) The user clicks a button, say one called dry-run :-), to generate a script that contains all the fragpipe commands.
3) The user submits the script to the HPC cluster in batch mode without having to look after the GUI.

I looked at the logs for the running pipeline under fragpipe, and I can see that it is possible to do but not easy for someone outside of fragpipe team :-). Thanks in advance, Cheers.

fcyu commented 2 years ago

Thanks for your interest in our tools. We are working on a headless version of FragPipe, which can be run in a command-line environment. It should be ready soon. Stay tuned!

Best,

Fengchao

bmoa commented 2 years ago

Thank you very much Fengchao! We cannot wait to have the headless version of FragPipe on our clusters. Cheers...

tobiasko commented 2 years ago

@fcyu Is there something that interested people should "subscribe" to in order to stay tuned?

fcyu commented 2 years ago

Hi @tobiasko ,

You just need to subscribe to this FragPipe repository. I will send out a beta version as soon as it is ready.

Stay tuned,

Fengchao

fcyu commented 2 years ago

Hi all,

The beta version of FragPipe headless is ready. Here (https://www.dropbox.com/s/uyc01xb8ytj998k/FragPipe-17.2-build28.zip?dl=1) is the link to download it. It is actually part of FragPipe. To trigger the headless mode, type fragpipe.bat in Windows or fragpipe in Linux followed by the options:

Running without GUI. Usage:
        Windows: fragpipe.bat --headless --workflow <path to workflow file> --manifest <path to manifest file> --workdir <path to result directory>
        Linux: fragpipe --headless --workflow <path to workflow file> --manifest <path to manifest file> --workdir <path to result directory>
Options:
        -h
        --help                          # Print this help message.
        --headless                      # Running in headless mode.
        --workflow <string>             # Specify path to workflow file.
        --manifest <string>             # Specify path to manifest file.
        --workdir <string>              # Specify the result directory.
        --dry-run                       # (optional) Dry run, not really run FragPipe.
        --ram <integer>                 # (optional) Specify the maximum allowed memory size. Set it to 0 to let FragPipe decide. Default = 0
        --threads <integer>             # (optional) Specify the number of threads. Default = core number - 1
        --config-msfragger <string>     # (optional) specify the location of the MSFragger jar file. If not specified, using the one in the cache.
        --config-philosopher <string>   # (optional) specify the location of the Philosopher binary file. If not specified, using the one in the cache.
        --config-python <string>        # (optional) specify the location of the Python directory. If not specified, using the one in the cache.

For first-time use, we recommend loading the spectral files in the FragPipe GUI and saving the workflow and manifest files. Then update the database.db-path parameter at the beginning of the workflow file, and run FragPipe headless using the workflow and manifest files.
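A minimal sketch of how the options above could be turned into a batch script, which is what the original request asked for. It uses only flags that appear in the help text; the function names, the example paths and the scheduler header line are hypothetical, and any scheduler directives are site-specific.

```python
import shlex


def build_headless_command(fragpipe_bin, workflow, manifest, workdir,
                           ram=None, threads=None):
    """Assemble a `fragpipe --headless` command from flags in the help text."""
    cmd = [fragpipe_bin, "--headless",
           "--workflow", workflow,
           "--manifest", manifest,
           "--workdir", workdir]
    if ram is not None:
        cmd += ["--ram", str(ram)]          # 0 lets FragPipe decide
    if threads is not None:
        cmd += ["--threads", str(threads)]
    return cmd


def render_batch_script(cmd, header_lines=()):
    """Render a bash script; header_lines holds site-specific scheduler directives."""
    lines = ["#!/bin/bash", *header_lines, "set -euo pipefail",
             shlex.join(cmd), ""]
    return "\n".join(lines)


if __name__ == "__main__":
    # All paths below are placeholders, not real locations.
    cmd = build_headless_command("fragpipe/bin/fragpipe", "my.workflow",
                                 "my.fp-manifest", "results", threads=8)
    print(render_batch_script(cmd))
```

Quoting each argument with `shlex.join` keeps paths that contain spaces intact when the script is later run by the scheduler.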

Please feel free to let us know if you have any questions or find any bugs.

Merry Christmas and happy new year,

Fengchao

bmoa commented 2 years ago

This is amazing, Fengchao. Very much appreciated! I downloaded the beta version and will give it a try. I will keep you posted as soon as I have a chance to test it. Cheers! Belaid.

bmoa commented 2 years ago

Hi Fengchao, This is just to let you know that we had a chance to try the headless version of fragpipe. The build 4 that you originally sent did not work and produced a JPanel error, but the build 27 you posted on another thread worked nicely. Is there a permanent link or repository where we can get the latest headless version, similar to the GUI version? That way we can stay up to date with the headless version without bothering you for a Dropbox link. Cheers, -Belaid.

fcyu commented 2 years ago

Hi Belaid,

Thank you very much for your feedback. I am glad to hear that the headless mode works for you.

We don't have a permanent place to publish compiled pre-release versions. But if you are interested, you can compile FragPipe yourself. The latest code is here (https://github.com/Nesvilab/FragPipe/tree/develop), and a brief document on how to compile it is here (https://github.com/Nesvilab/FragPipe/tree/gh-pages#building-from-scratch).

Best,

Fengchao

bmoa commented 2 years ago

That's excellent, Fengchao. I did not know that the headless version is actually part of the fragpipe repository. Beautiful! I will follow the instructions to build pre-releases for now. Cheers, -Belaid.

bmoa commented 2 years ago

Hi Fengchao, One of our users is doing a run based on DIA_SpecLib_Quant.workflow but we are consistently getting the following error:

[SEVERE] Could not dispatch event: class com.dmtavt.fragpipe.messages.MessageRun to subscribing class class com.dmtavt.fragpipe.tabs.TabRun
java.util.NoSuchElementException: Sticky note not on the bus: com.dmtavt.fragpipe.messages.NoteConfigSpeclibgen
        at com.dmtavt.fragpipe.Fragpipe.getStickyStrict(Fragpipe.java:930)
        at com.dmtavt.fragpipe.FragpipeRun.lambda$configureTaskGraph$55(FragpipeRun.java:1228)
        at com.dmtavt.fragpipe.FragpipeRun.configureTaskGraph(FragpipeRun.java:1354)
        at com.dmtavt.fragpipe.FragpipeRun.run(FragpipeRun.java:228)
        at com.dmtavt.fragpipe.tabs.TabRun.on(TabRun.java:164)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:567)
        at org.greenrobot.eventbus.EventBus.invokeSubscriber(EventBus.java:517)
        at org.greenrobot.eventbus.EventBus.invokeSubscriber(EventBus.java:511)
        at org.greenrobot.eventbus.AsyncPoster.run(AsyncPoster.java:46)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:830)
16:29:33 ERROR - Error delivering events through the bus
java.util.NoSuchElementException: Sticky note not on the bus: com.dmtavt.fragpipe.messages.NoteConfigSpeclibgen
        at com.dmtavt.fragpipe.Fragpipe.getStickyStrict(Fragpipe.java:930)
        at com.dmtavt.fragpipe.FragpipeRun.lambda$configureTaskGraph$55(FragpipeRun.java:1228)
        at com.dmtavt.fragpipe.FragpipeRun.configureTaskGraph(FragpipeRun.java:1354)
        at com.dmtavt.fragpipe.FragpipeRun.run(FragpipeRun.java:228)
        at com.dmtavt.fragpipe.tabs.TabRun.on(TabRun.java:164)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:567)
        ....

Are we missing something at our end? Cheers, -Belaid.

fcyu commented 2 years ago

Which version are you using?

bmoa commented 2 years ago

Which version are you using?

17.2 build 27 Cheers, -Belaid.

fcyu commented 2 years ago

Hi Belaid,

Can you check whether Python and EasyPQP are properly installed and recognized by FragPipe?

Thanks,

Fengchao

bmoa commented 2 years ago

Hi Fengchao, Thanks for the quick reply as usual ☺ Yes, they are: we used the GUI first to set up the workflow and manifest files. I just checked and I can confirm that Python (3.8.10) and EasyPQP (0.1.27) are recognized by the GUI of fragpipe 17.2, build 27. Cheers, -Belaid.


bmoa commented 2 years ago

Just as a follow-up: I did some quick testing using an interactive job and noticed that the first time I run it, fragpipe headless shows the error I reported but does not terminate. After killing it and retrying the same command under the same interactive job, it worked. I tried two interactive jobs and this behavior was consistent. I then tried a pure batch job that runs the same fragpipe headless command twice: it kills the first run after 30 seconds and then starts the second, and I can confirm that this works. I did not have a chance to debug this further, but it looks like the first fragpipe command somehow sets up something that fragpipe headless cannot run without. Hopefully this gives you an idea of what's going on ☺ Cheers, -Belaid.


fcyu commented 2 years ago

Hi Belaid,

Thanks for your testing. I found a bug which might be the cause of the error. Can you try the new version from https://www.dropbox.com/s/uyc01xb8ytj998k/FragPipe-17.2-build28.zip?dl=1 ?

Best,

Fengchao

bmoa commented 2 years ago

Hi Fengchao, That build seems to solve the issue so far. Very much appreciated! The run has not finished yet because I underestimated the walltime, so I just resubmitted the job. I am hoping that fragpipe is smart enough to restart from where it left off :-) I do not see a restart option in 'fragpipe --help'. Cheers, -Belaid.

fcyu commented 2 years ago

Hi Belaid,

Thank you for the positive feedback!

FragPipe is smart, but not as smart as you expect. If you want to skip the finished steps, you need to "uncheck" (in GUI) or set to false (in the workflow file used in the headless mode) for the corresponding tools.

Best,

Fengchao

bmoa commented 2 years ago

Thanks Fengchao. I believe these are the headless options to check/uncheck for each tool/task:

crystalc.run-crystalc=false
diann.run-dia-nn=true
diaumpire.run-diaumpire=false
freequant.run-freequant=false
ionquant.run-ionquant=true
msbooster.run-msbooster=true
msfragger.run-msfragger=true
peptide-prophet.run-peptide-prophet=false
percolator.run-percolator=true
phi-report.run-report=true
protein-prophet.run-protein-prophet=true
ptmprophet.run-ptmprophet=false
ptmshepherd.run-shepherd=false
quantitation.run-label-free-quant=false
run-psm-validation=true
speclibgen.run-speclibgen=true
tmtintegrator.dont-run-fq-lq=false
tmtintegrator.run-tmtintegrator=false

Cheers, -Belaid.
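A minimal sketch of switching off finished steps before resubmitting a job, assuming the workflow file is plain `key=value` text with one setting per line, as the keys listed above suggest. The function name is hypothetical, and the sketch does not handle any escaping rules the real file format may have.

```python
def set_workflow_flags(text, updates):
    """Return workflow text with the given keys set to new values.

    Raises KeyError if a requested key is absent, so that a typo does not
    silently leave a step enabled.
    """
    remaining = dict(updates)
    out = []
    for line in text.splitlines():
        key, sep, _ = line.partition("=")
        name = key.strip()
        if sep and name in remaining:
            out.append(f"{name}={remaining.pop(name)}")
        else:
            out.append(line)  # comments and untouched settings pass through
    if remaining:
        raise KeyError(f"keys not found in workflow: {sorted(remaining)}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "msfragger.run-msfragger=true\npercolator.run-percolator=true\n"
    # Skip the search step on a rerun, keep everything else as it was.
    print(set_workflow_flags(sample, {"msfragger.run-msfragger": "false"}))
```

Writing the result to a new file rather than overwriting the original keeps the GUI-generated workflow available for a clean rerun.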

f-huber commented 2 years ago

Hi Fengchao, Thank you very much for this headless version of fragpipe. I tested it (using the nonspecific HLA workflow, MSBooster included) and it works like a charm. A minor comment though: I will run it on an HPC cluster, so I built a Singularity container for this headless version of fragpipe. Singularity containers are read-only, and when I run fragpipe in the container I get some error messages at launch, but this does not stop fragpipe from running the workflow successfully.

I think it would be nice if fragpipe could print a single warning for a read-only file system instead of the error. I attached the error log for your information (truncated, everything below was looking good). Best regards,

Florian fragpipe_headless_singularity_container.log

fcyu commented 2 years ago

Hi Florian,

I see. It looks like FragPipe couldn't save cache due to the read-only directory. Since it doesn't affect anything else, I think it is ok to keep as it is.

Best,

Fengchao

f-huber commented 2 years ago

Hi Fengchao,

Yes, exactly. Definitely not a big issue.

Best,

Florian

f-huber commented 2 years ago

Hi Fengchao,

I tried to run fragpipe headless on HPC in a container.

I pasted the log below (the error is due to this read-only file system described above - but it does not prevent fragpipe from running). After this message, fragpipe keeps running but nothing happens.

It runs well if internet is accessible, however if it runs in an environment without internet, it seems to be stuck at the beginning. I noticed that some tools such as philosopher try to connect to internet and check for updates - could it be the problem?

With best regards,

Florian

15:56:33 ERROR - Error initializing fragpipe locations
java.nio.file.FileSystemException: /softwares/fragpipe/lib/../cache: Read-only file system
        at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)
        at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
        at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
        at java.base/sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:389)
        at java.base/java.nio.file.Files.createDirectory(Files.java:690)
        at java.base/java.nio.file.Files.createAndCheckIsDirectory(Files.java:797)
        at java.base/java.nio.file.Files.createDirectories(Files.java:783)
        at com.github.chhh.utils.PathUtils.createDirs(PathUtils.java:82)
        at com.dmtavt.fragpipe.FragpipeLocations$Holder.<clinit>(FragpipeLocations.java:121)
        at com.dmtavt.fragpipe.FragpipeLocations.get(FragpipeLocations.java:183)
        at com.dmtavt.fragpipe.FragpipeLoader.lambda$loadCache$0(FragpipeLoader.java:141)
        at java.base/java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(ForkJoinTask.java:1407)
        at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
        at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
        at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
        at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
        at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)

fcyu commented 2 years ago

Hi Florian,

Your log still only shows that FragPipe cannot write its cache on a read-only file system, which is not a big deal since it does not affect anything else.

Regarding your other errors, can you share the whole log?

Thanks,

Fengchao

f-huber commented 2 years ago

Hi Fengchao,

Yes, that's correct. And this is the problem, there is nothing else in the log (fragpipe does not go further)

Best,

Florian

fcyu commented 2 years ago

Hi Florian,

I tried to reproduce it by unplugging my Ethernet cable. There are warnings, but FragPipe continues to run without any issue. Please see my attached log.

Best,

Fengchao

log.txt

f-huber commented 2 years ago

Hi Fengchao,

Thanks a lot, I'll try to understand what's going on on my side.

Best,

Florian

f-huber commented 2 years ago

Hi Fengchao,

Sorry, it took me more time than expected to come back to you.

I re-ran fragpipe and it worked well. I am really sorry about this false alert, the problem came from a (not announced) maintenance on the HPC cluster where I was running fragpipe.

However, I noticed an issue in the headless version of fragpipe when database splitting is enabled (this error occurred with and without headless mode enabled, and inside/outside of the Singularity container). I ran the exact same analysis without database splitting and it worked.

The error seems to occur in the msfragger_pep_split.py Python script. Here is the error:

MSFragger [Work dir: /data/user/fragpipe_test/test_folder/tst_2]
/usr/bin/python3.8 /data/user/fragpipe_test/local_softwares/fragpipe/tools/msfragger_pep_split.py 2 "java -jar -Dfile.encoding=UTF-8 -Xmx109G" /data/user/fragpipe_test/test_folder/MSFragger-3.4/MSFragger-3.4.jar /data/user/fragpipe_test/test_folder/tst_2/fragger.params /data/user/fragpipe_test/test_folder/test_sample-HLAIp_R01.mzML /data/user/fragpipe_test/test_folder/test_sample-HLAIp_R02.mzML
Process 'MSFragger' finished, exit code: 1
Process returned non-zero exit code, stopping
Traceback (most recent call last):
  File "/data/user/fragpipe_test/local_softwares/fragpipe/tools/msfragger_pep_split.py", line 348, in <module>
    def replace_prot_list(sh: bytes, prot_list: list[Prot]) -> bytes:
TypeError: 'type' object is not subscriptable

I also attached the full log if needed.

Thanks again for all your help,

With best regards, log_2022-03-25_13-21-32.txt

Florian

fcyu commented 2 years ago

@guoci , the error was from the split script. Can you take a look?

Thanks,

Fengchao

guoci commented 2 years ago

@f-huber I will fix the error in the next release. For now, can you try to use Anaconda Python (instead of the system installed) to resolve that?
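For context on the traceback: subscripting the built-in `list` in an annotation, as in `list[Prot]`, is only supported from Python 3.9 onward, and the failing command used `/usr/bin/python3.8`. That is most likely why a different (presumably newer) Python avoids the error. The sketch below shows a spelling that also works on Python 3.8; `Prot` here is a hypothetical stand-in, since the real type in msfragger_pep_split.py is not shown in this thread, and this is not necessarily how the script was fixed.

```python
from typing import List, NamedTuple


class Prot(NamedTuple):
    # Hypothetical stand-in for the script's own Prot type.
    name: str


def count_prots(prot_list: List[Prot]) -> int:
    """typing.List[...] is accepted on Python 3.8, unlike list[...]."""
    return len(prot_list)


if __name__ == "__main__":
    print(count_prots([Prot("P1"), Prot("P2")]))
```

Adding `from __future__ import annotations` at the top of a module is another common way to keep `list[...]` annotations from being evaluated on older interpreters.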

f-huber commented 2 years ago

@guoci Thanks a lot, I'll try this early next week and come back to you. Best, Florian

f-huber commented 2 years ago

Hi @guoci and @fcyu ,

Thanks a lot for your help and solution: running fragpipe with Anaconda solved the database split issue.

However, I'm facing another bug later in the workflow: when it runs ProteinProphet (after msbooster + percolator): WARNING: Cannot find database specified: "samplename.decoys_contaminants.fa".

The problem is that the database path in the pep.xml file has "lost" its directory part: only the filename is stored, which is not the case when fragpipe runs without database splitting.

Here is the path when number of splits is set to 1:

<search_database local_path="/full/path/database.decoys_contaminants.fa" type="AA"/>
<enzymatic_search_constraint enzyme="default" min_number_termini="0" max_num_internal_cleavages="2"/>
<enzymatic_search_constraint enzyme="default2" min_number_termini="0" max_num_internal_cleavages="2"/>
<aminoacid_modification aminoacid="C" massdiff="57.0215" mass="160.0307" variable="N"/>
<aminoacid_modification aminoacid="M" massdiff="15.9949" mass="147.0354" variable="Y"/>
<terminal_modification massdiff="42.0106" protein_terminus="Y" mass="43.0184" terminus="N" variable="Y"/>
<parameter name="# MSFragger.build" value="MSFragger-3.4"/>
<parameter name="database_name" value="/full/path/database.decoys_contaminants.fa"/>

And when it is set > 1:

<search_database local_path="database.decoys_contaminants.fa" type="AA"/>
<enzymatic_search_constraint enzyme="default" min_number_termini="0" max_num_internal_cleavages="2"/>
<enzymatic_search_constraint enzyme="default2" min_number_termini="0" max_num_internal_cleavages="2"/>
<aminoacid_modification aminoacid="C" massdiff="57.0215" mass="160.0307" variable="N"/>
<aminoacid_modification aminoacid="M" massdiff="15.9949" mass="147.0354" variable="Y"/>
<terminal_modification massdiff="42.0106" protein_terminus="Y" mass="43.0184" terminus="N" variable="Y"/>
<parameter name="# MSFragger.build" value="MSFragger-3.4"/>
<parameter name="database_name" value="database.decoys_contaminants.fa"/>
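A quick way to spot this symptom before ProteinProphet runs is to scan the pep.xml text for `search_database` entries whose `local_path` is not absolute. This is only a diagnostic sketch: the function name is hypothetical, it assumes POSIX-style paths as in the excerpts above, and it uses a regular expression rather than a full XML parser.

```python
import re
from pathlib import PurePosixPath

# Matches the local_path attribute of <search_database .../> elements.
_LOCAL_PATH = re.compile(r'<search_database\s+local_path="([^"]*)"')


def relative_database_paths(pepxml_text):
    """Return every search_database local_path that is not an absolute path."""
    return [p for p in _LOCAL_PATH.findall(pepxml_text)
            if not PurePosixPath(p).is_absolute()]


if __name__ == "__main__":
    snippet = '<search_database local_path="database.decoys_contaminants.fa" type="AA"/>'
    print(relative_database_paths(snippet))
```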

I also attached the detailed log, in case you need it: log_2022-03-28_14-12-22.txt

I also noticed one minor bug (I am not sure it really is a bug, though). When I run fragpipe in headless mode, even though I specify the paths to Python, MSFragger and Philosopher, fragpipe still imports this information from the cache first. I believe this is not a problem in most cases, but if files have been moved to another place, fragpipe exits with an error: "ERROR - Path does not exist:". This can be solved by removing all of fragpipe's cache data, but I think it might be worth skipping the import from the cache if the paths are provided on the command line.

Finally, and I am not sure this is related to fragpipe or to the containerization, but I noticed that percolator fails if libgomp is not installed on the system (and works if I install it manually - apt install libgomp1). I don't think I saw this library as a requirement in fragpipe installation manual, you might want to add it. Here is the error, if you need it:

...
Process 'MSBooster' finished, exit code: 0
Percolator [Work dir: /data/test]
/softwares/fragpipe/tools/percolator-305/percolator --only-psms --no-terminate --post-processing-tdc --num-threads 24 --results-psms filename_percolator_target_psms.tsv --decoy-results-psms filename_percolator_decoy_psms.tsv filename_edited.pin
/softwares/fragpipe/tools/percolator-305/percolator: error while loading shared libraries: libgomp.so.1: cannot open shared object file: No such file or directory
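A container build could check for the missing library up front rather than discovering it mid-run. The sketch below uses Python's `ctypes.util.find_library`; the helper name is hypothetical. Note that `find_library` relies on system tools such as ldconfig, so on a very minimal image it may report a library as missing even when it is installed.

```python
import ctypes.util


def missing_shared_libraries(names):
    """Return the library names that ctypes cannot locate on this system."""
    return [n for n in names if ctypes.util.find_library(n) is None]


if __name__ == "__main__":
    # "gomp" corresponds to libgomp.so.1, the library named in the error above.
    missing = missing_shared_libraries(["gomp"])
    if missing:
        print("Missing shared libraries:", ", ".join(missing))
    else:
        print("All required shared libraries were found.")
```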

Thanks for all your help and best regards,

Florian

fcyu commented 2 years ago

Hi Florian,

Thanks for your feedback. Guo Ci @guoci , can you take a look?

Thanks,

Fengchao

guoci commented 2 years ago

@f-huber please use the updated FragPipe below for the fix to the database path problem. https://drive.google.com/drive/folders/1g1Z_v_eWKhIuuC2wTqu0WahkIAgEBxzi?usp=sharing Will resolve the remaining issues.

f-huber commented 2 years ago

Hi @guoci and @fcyu ,

Thanks a lot for your reply. I am really impressed by how fast you reply guys!

I tested the new version, it works nicely for the database, but now I am facing another issue with philosopher filter. The problem comes from the pep xml import.

I know that I'm using non-standard entry formatting, but it works if I do not use the database splitting and fails with the database splitting.

Here is the filter.log:

time="21:53:47" level=info msg="Executing Filter  v4.1.1"
time="21:53:47" level=info msg="Processing peptide identification files"
time="21:53:48" level=fatal msg="Cannot decode packed binary. XML syntax error on line 35702: invalid XML name: 21031-REF"

Here is the full log of the run (btw, it did not create the log file automatically, I got it from stdout/stderr): fragpipe_dbsplit_8.log

For information, I also noticed that the size of the pepXML files is different when running without database splitting and with database splitting: No splitting:

37M Mar 28 13:56 interact-20210723_CTE-BIO-19519-HLAIp_R01.pep.xml
38M Mar 28 13:56 interact-20210723_CTE-BIO-19519-HLAIp_R02.pep.xml

Splitting:

36M Mar 29 08:07 interact-20210723_CTE-BIO-19519-HLAIp_R01.pep.xml
36M Mar 29 08:07 interact-20210723_CTE-BIO-19519-HLAIp_R02.pep.xml

I guess you might need the pep.xml files, here they are: pepxml.zip

Let me know if I can help with additional logs or files.

Best,

Florian

guoci commented 2 years ago

@f-huber please try the following fix https://drive.google.com/drive/folders/1g1Z_v_eWKhIuuC2wTqu0WahkIAgEBxzi?usp=sharing

f-huber commented 2 years ago

@guoci Thanks a lot, I tried, but I got a different error in the filter (I only tried with database splitting):

time="19:33:47" level=info msg="Executing Filter  v4.1.1"
time="19:33:47" level=info msg="Processing peptide identification files"
time="19:33:49" level=fatal msg="Cannot decode packed binary. XML syntax error on line 46: illegal character code U+0000"

Let me know if you need something else

Thanks again

guoci commented 2 years ago

@prvst can you take a look at this?

f-huber commented 2 years ago

@guoci and @prvst : I tried again and removed the .meta directory generated from the last failed test and it worked. I'll try again and come back to you - sorry if the issue was on my side

prvst commented 2 years ago

Hi @f-huber. The illegal character code shows that your file encoding is not right. Normally this happens when the system locales are configured for a language different from English, sometimes some character is represented using an unsupported glyph, like — and -, for example (both are hyphens, but represented differently). The code U+0000 means a NULL character which can be a problem during the file generation. My suggestion is that you first check the system locale settings, clean and remove all intermediary files, and then try again.

f-huber commented 2 years ago

Hi @prvst and @guoci : Thanks a lot for your answers, and also for the explanation of the U+0000 code, which I didn't know. I know that the issue is not due to the language, as I configured all the locales to English. But removing the .meta folder that was created by a previous failed run (one of the tests with the previous release) solved the problem. It was therefore likely due to an incomplete intermediate file produced by a previous analysis.

Thanks a lot for all your help, and sorry for taking up your time with this issue.
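To check whether a pep.xml file really contains NUL (U+0000) bytes, as the Philosopher error message reports, a small scan like the following can help. It is a sketch with a hypothetical function name; it accepts any iterable of byte lines, so it works both on an open binary file and on in-memory data.

```python
def nul_line_numbers(lines, limit=10):
    """Return up to `limit` 1-based line numbers that contain a NUL byte."""
    hits = []
    for lineno, line in enumerate(lines, start=1):
        if b"\x00" in line:
            hits.append(lineno)
            if len(hits) >= limit:
                break
    return hits


if __name__ == "__main__":
    import sys
    # Usage: python check_nul.py interact-sample.pep.xml  (file name is a placeholder)
    with open(sys.argv[1], "rb") as fh:
        print(nul_line_numbers(fh))
```

Opening the file in binary mode avoids decoding errors and reports the same line numbers a text editor would show.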

Munchic commented 2 years ago

Hello, thank you for making this pre-release! I am testing the linux CLI version on Google Cloud compute and came across an issue during the main search:

In total 431222847 peptides.
Generated 1003167734 modified peptides.
Number of peptides with more than 5000 modification patterns: 0
java.io.IOException: No space left on device
    at java.base/java.io.FileOutputStream.writeBytes(Native Method)
    at java.base/java.io.FileOutputStream.write(FileOutputStream.java:354)
    at java.base/java.io.BufferedOutputStream.write(BufferedOutputStream.java:123)
    at java.base/java.io.DataOutputStream.write(DataOutputStream.java:107)
    at b.a(Unknown Source)
    at n.a(Unknown Source)
    at n.a(Unknown Source)
    at g.a(Unknown Source)
    at k.<init>(Unknown Source)
    at edu.umich.andykong.msfragger.MSFragger.b(Unknown Source)
    at edu.umich.andykong.msfragger.MSFragger.main(Unknown Source)
Process 'MSFragger' finished, exit code: 1
Process returned non-zero exit code, stopping

I have given it 64 CPU, 256 GB RAM, and 1000 GB storage so it shouldn't be an issue to search one HLA II timsTOF file? In the docker container for this, fragpipe_cli and MSFragger .jar both have read/write/execute permissions. I am attaching the full log and workflow files. Let me know if I am doing something wrong, thank you.

log.txt HLA_TIMSTOFscp_ClassII.workflow.txt

fcyu commented 2 years ago

java.io.IOException: No space left on device. There is no free space left in your computer.

Best,

Fengchao

Munchic commented 2 years ago

Thank you, Fengchao. I am running in a VM instance on the cloud with 1000 GB (I also tried 2000 and 4000 GB), but the same error popped up in every case. I think it might not be due to a lack of space?

fcyu commented 2 years ago

Maybe MSFragger didn't have the access to the empty space. I noticed that you are running it in a docker container. You need to troubleshoot your configurations.
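One way to troubleshoot this is to measure free space on the exact directories the job writes to, from inside the container, since those may sit on a different (smaller) filesystem than the VM's large disk. The sketch below uses `shutil.disk_usage` from the Python standard library; the function name and the example paths are placeholders.

```python
import shutil


def free_gib(path):
    """Free space, in GiB, on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / 2**30


if __name__ == "__main__":
    # Placeholder locations: the current directory and the system temp directory.
    for location in (".", "/tmp"):
        try:
            print(f"{location}: {free_gib(location):.1f} GiB free")
        except FileNotFoundError:
            print(f"{location}: not present in this environment")
```

Running it both on the host and inside the container shows whether the container actually sees the storage allocated to the VM.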

Best,

Fengchao

Munchic commented 2 years ago

Thank you, I will double check again and let you know! :)

FloydKu commented 2 years ago

Hi Fengchao! The Dropbox link to the headless beta posted earlier in this thread returns a 404 error. How can I download the beta version of FragPipe that can run without the GUI?

danielgeiszler commented 2 years ago

It’s currently available from the release page.

https://github.com/Nesvilab/FragPipe/releases/tag/18.0


FloydKu commented 2 years ago

Thank you, Daniel!

tobiasko commented 2 years ago

I just tested the new --headless mode on Linux (Debian). Nice work, @fcyu!

tobiasko@fgcz-c-073:~/20220531$ ~/FragPipe-18/fragpipe/bin/fragpipe --headless --workflow LFQ-MBR.workflow --manifest o28463.fp-manifest --workdir headlessOutput
2022-06-02 14:58:24,175 WARN  - Output directory doesn't exist. Creating it.
System OS: Linux, Architecture: amd64
Java Info: 11.0.15, OpenJDK 64-Bit Server VM, Debian

Version info:
FragPipe version 18.0
MSFragger version 3.5
Philosopher version 4.2.2

LCMS files:
  Experiment/Group: RanGAP_AK_3
  - /home/tobiasko/20220531/20220525_007_S387658_RanGAP_AK.raw  DDA
  Experiment/Group: RanGAP_DhaK_1
  - /home/tobiasko/20220531/20220525_003_S387659_RanGAP_DhaK.raw    DDA
  Experiment/Group: RanGAP_wt_2
  - /home/tobiasko/20220531/20220525_005_S387657_RanGAP_wt.raw  DDA

45 commands to execute:
...
Process 'IonQuant' finished, exit code: 0

Please cite:
(Any searches) MSFragger: ultrafast and comprehensive peptide identification in mass spectrometry–based proteomics. Nat Methods 14:513 (2017)
(Any searches) Fast deisotoping algorithm and its implementation in the MSFragger search engine. J. Proteome Res. 20:498 (2021)
(Open search) Identification of modified peptides using localization-aware open search. Nat Commun. 11:4065 (2020)
(Open search) Crystal-C: A Computational Tool for Refinement of Open Search Results. J. Proteome Res. 19.6:2511 (2020)
(Open search) PTM-Shepherd: analysis and summarization of post-translational and chemical modifications from open search results. Mol Cell Proteomics 20:100018 (2020)
(Glyco/labile search) Fast and comprehensive N- and O-glycoproteomics analysis with MSFragger-Glyco. Nat Methods 17:1125 (2020)
(timsTOF PASEF) Fast quantitative analysis of timsTOF PASEF data with MSFragger and IonQuant. Mol Cell Proteomics 19:1575 (2020)
(PSM validation with Percolator) Semi-supervised learning for peptide identification from shotgun proteomics datasets. Nat Methods 4:923 (2007)
(Label-free quantification/SILAC) IonQuant Enables Accurate and Sensitive Label-Free Quantification With FDR-Controlled Match-Between-Runs. Mol Cell Proteomics 20:100077 (2021)
(PeptideProphet/ProteinProphet/PTMProphet/Filtering) Philosopher: a versatile toolkit for shotgun proteomics data analysis. Nat Methods 17:869 (2020)
(TMT-Integrator) Quantitative proteomic landscape of metaplastic breast carcinoma pathological subtypes and their relationship to triple-negative tumors. Nat Commun. 11:1723 (2020)
(DIA-Umpire) DIA-Umpire: comprehensive computational framework for data-independent acquisition proteomics. Nat Methods 12:258 (2015)
(DIA-NN) High sensitivity dia-PASEF proteomics with DIA-NN and FragPipe. bioRxiv doi:10.1101/2021.03.08.434385 (2021)

=============================================================ALL JOBS DONE IN 6.2 MINUTES=============================================================
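
For the batch use case that started this thread, the headless call above could be wrapped in a job script. The following is only a minimal sketch: the Slurm scheduler, the `#SBATCH` values, the `FRAGPIPE_BIN` variable, and the `run_fragpipe` helper are assumptions for illustration and are not part of FragPipe. The only FragPipe options used are `--headless`, `--workflow`, `--manifest`, `--workdir`, and `--threads`, as listed in the usage message posted earlier.

```shell
#!/usr/bin/env bash
# Sketch of a batch wrapper around the headless FragPipe call.
# Assumed, not from this thread: Slurm, the resource values below, FRAGPIPE_BIN.
#SBATCH --job-name=fragpipe
#SBATCH --cpus-per-task=16
#SBATCH --mem=64G

set -eu

# Path to the FragPipe launcher; adjust to the local installation.
FRAGPIPE_BIN="${FRAGPIPE_BIN:-$HOME/FragPipe-18/fragpipe/bin/fragpipe}"

# Run FragPipe headless with a workflow file, a manifest file, and a result
# directory. The thread count follows the scheduler's allocation when
# SLURM_CPUS_PER_TASK is set, and falls back to 1 otherwise.
run_fragpipe() {
  "$FRAGPIPE_BIN" --headless \
    --workflow "$1" \
    --manifest "$2" \
    --workdir "$3" \
    --threads "${SLURM_CPUS_PER_TASK:-1}"
}

# Only start a run when all three arguments are given.
if [ "$#" -ge 3 ]; then
  run_fragpipe "$1" "$2" "$3"
fi
```

Under these assumptions, the run shown above would be submitted as `sbatch fragpipe_job.sh LFQ-MBR.workflow o28463.fp-manifest headlessOutput`, where `fragpipe_job.sh` is a hypothetical file name for this script. Passing `--threads` explicitly keeps FragPipe within the scheduler's allocation instead of its default of core number minus one.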