Closed tobiasko closed 4 years ago
Hi Tobi,
Thanks for your interest in IonQuant. The paper will be available soon. You will be able to see all the details from the paper.
Best,
Fengchao
On Tue, 17 Mar 2020 at 5:43 AM, Tobias Kockmann notifications@github.com wrote:
Dear FragPipe team,
we (FGCZ) are intensifying our testing of MSFragger for PASEF data. Could you share some info regarding how IonQuant works?
- How do you detect features?
- How are features aligned in which space?
- Are you using an approach similar to "match-between-runs" (ID migration)?
- In which dimensions is the data recalibrated (I assume the m/z dim. is recalibrated)?
- Is it possible to view features that are detected by IonQuant in some way? Would, for instance, an XIC/XIM in Bruker data analysis or Skyline show something different from what IonQuant sees?
- Have you checked how Bruker MS data reduction levels affect identification/quantification?
Sorry for the many questions, but since IonQuant is still kind of a black box (no publication) this is the only way ;-)
Thanks, Tobi
-- Dr. Fengchao Yu University of Michigan
ok. will there be a preprint?
Hi @tobiasko ,
Yes, we just posted our manuscript on bioRxiv (https://www.biorxiv.org/content/10.1101/2020.03.19.999334v1).
We also have match-between-runs (MBR) done and will release it soon. Please feel free to contact us if you have any questions.
Best,
Fengchao
WOW! Very nice. Thanks for sharing the manuscript. Should you need any beta testers or additional data let us know.
I finally managed to read your manuscript today! Nice work! In addition, we now have a running FragPipe-like installation on unix incl. IonQuant. Related to this:
"When used with Philosopher summary tables as input, IonQuant adds quantification information directly to the tables containing validated PSM, peptide, and protein results."
Are the modified files psm.tsv, peptide.tsv and protein.tsv?
*_quant.csv
tobiasko@fgcz-r-033:/scratch/tobiasko/test$ ls -la
total 498948
drwxr-xr-x 4 tobiasko SG_Employees 4096 Apr 2 16:58 .
drwxrwxr-x 6 tobiasko SG_Employees 307 Apr 2 14:22 ..
-rw-r--r-- 1 tobiasko SG_Employees 251863834 Apr 2 14:26 2-21-2020_autoQC4L_444_1_calibrated.mgf
drwxr-xr-x 2 tobiasko SG_Employees 62 Apr 2 16:42 2-21-2020_autoQC4L_444_1.d
-rw-r--r-- 1 tobiasko SG_Employees 162468130 Apr 2 14:25 2-21-2020_autoQC4L_444_1.mzBIN
-rw-r--r-- 1 tobiasko SG_Employees 30618591 Apr 2 14:27 2-21-2020_autoQC4L_444_1.pepXML
-rw-r--r-- 1 tobiasko SG_Employees 6466130 Apr 2 16:58 2-21-2020_autoQC4L_444_1_quant.csv
-rw-r--r-- 1 tobiasko SG_Employees 306775 Apr 2 16:20 delta-mass.html
-rw-rw-r-- 1 tobiasko SG_Employees 38646290 Apr 2 15:44 interact-2-21-2020_autoQC4L_444_1.pep.xml
-rw-r--r-- 1 tobiasko SG_Employees 6482645 Apr 2 15:46 interact.prot.xml
-rw-r--r-- 1 tobiasko SG_Employees 1479118 Apr 2 16:20 ion.tsv
drwxr-xr-x 2 tobiasko SG_Employees 4096 Apr 2 16:20 .meta
-rw-r--r-- 1 tobiasko SG_Employees 267942 Apr 2 16:20 modifications.tsv
-rw-r--r-- 1 tobiasko SG_Employees 1111545 Apr 2 16:20 peptide.tsv
-rw-r--r-- 1 tobiasko SG_Employees 1121670 Apr 2 16:20 protein.fas
-rw-r--r-- 1 tobiasko SG_Employees 559457 Apr 2 16:20 protein.tsv
-rw-r--r-- 1 tobiasko SG_Employees 9490531 Apr 2 16:20 psm.tsv
I am not 100% sure what the features in this table are. Is this the table of identified 4-D features per file?
MSstats.csv file or reprint-formatted files. Is this normal?
Thx for your great support, Tobi
I think the problem is, we would need to write MS1 in mzBIN which will make the file too big? Fengchao, what was your explanation for not doing it?
Hi @tobiasko ,
Currently mzBIN only contains MS/MS scans. The reasons for not putting MS scans into mzBIN are: 1) It would increase the size of mzBIN a lot, which would slow down the identification step (MSFragger). 2) I am not sure that putting MS scans into mzBIN would speed up the whole process, because reading MS scans from .d does not need the fancy pre-processing steps that take most of the time when loading MS/MS scans. That is also why loading .d in IonQuant (MS scans) takes much less time than loading .d in MSFragger (MS/MS scans).
And yes, IonQuant will update psm.tsv, ion.tsv, peptide.tsv, protein.tsv, combined_protein.tsv (if applicable), and combined_peptide.tsv (if applicable) by adding intensities.
The *_quant.csv files contain all quantified PSMs with some measures, such as apex retention time, retention time boundaries, apex ion mobility, ion mobility boundaries, and intensities. IonQuant writes these files at an early stage, and the later steps, such as updating Philosopher's tables, depend on them.
You may need to use Multiple experiment reports in FragPipe to trigger IonQuant to write an MSstats-compatible file.
Hopefully my answers are clear enough. Please feel free to contact me if you have any questions.
Best,
Fengchao
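To see which measures a given *_quant.csv file actually carries, one option is to print its header row with one column name per line. This is a minimal sketch, assuming the file is comma-separated with a single header line and no quoted commas in the header; `list_columns` is a hypothetical helper name, not part of IonQuant.

```shell
#!/bin/sh
# Hypothetical helper: print the column names of a CSV file, one per line.
# Assumes a single header row and no quoted commas inside header fields.
list_columns() {
    head -n 1 "$1" | tr -d '\r' | tr ',' '\n'
}

# Usage (file name taken from the directory listing in this thread):
#   list_columns 2-21-2020_autoQC4L_444_1_quant.csv
```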
Thx! I just used:
java -Xmx32G -jar /usr/local/nesvilab/IonQuant-1.0.0.jar --multidir "multiExpRes" 2-21-2020_autoQC4L_444_1.d 2-21-2020_autoQC4L_444_1.pepXML
and the console reports:
2020-04-02 18:02:10 [INFO] - multidir = /export/data01/tobiasko/test/multiExpRes
but the folder is not created. Does it need to exist already?
I also cannot see any modification of the files you mentioned. Any idea why this might happen?
For a multi-experiment setup like this:
The command would be
java -Xmx32g -jar ionquant.jar --psm exp_1\psm.tsv --psm exp_2\psm.tsv --psm exp_3\psm.tsv --psm exp_4\psm.tsv --multidir ./ F:\data\Bruker\20180819_TIMS2_12-2_AnBr_SA_200ng_HeLa_50cm_120min_100ms_11CT_1_A1_01_2767.d exp_1\20180819_TIMS2_12-2_AnBr_SA_200ng_HeLa_50cm_120min_100ms_11CT_1_A1_01_2767.pepXML F:\data\Bruker\20180819_TIMS2_12-2_AnBr_SA_200ng_HeLa_50cm_120min_100ms_11CT_3_A1_01_2769.d exp_3\20180819_TIMS2_12-2_AnBr_SA_200ng_HeLa_50cm_120min_100ms_11CT_3_A1_01_2769.pepXML F:\data\Bruker\20180819_TIMS2_12-2_AnBr_SA_200ng_HeLa_50cm_120min_100ms_11CT_2_A1_01_2768.d exp_2\20180819_TIMS2_12-2_AnBr_SA_200ng_HeLa_50cm_120min_100ms_11CT_2_A1_01_2768.pepXML F:\data\Bruker\20180819_TIMS2_12-2_AnBr_SA_200ng_HeLa_50cm_120min_100ms_11CT_4_A1_01_2770.d exp_4\20180819_TIMS2_12-2_AnBr_SA_200ng_HeLa_50cm_120min_100ms_11CT_4_A1_01_2770.pepXML
In this example, the exp_1, exp_2, exp_3, and exp_4 folders should already exist, and you should be in their parent folder (i.e., ./).
Best,
Fengchao
Ahhh wait...what happens if you only have 1 file? Could that be the reason?
Yes, that's one of the reasons. I am not sure it makes sense to use MSstats with only 1 file. There is not much to normalize, and no differential analysis to perform.
Clear. I am just testing whether IonQuant works as expected on a very basic test case. But I should still get the modification of the .tsv files, right?
Yes, you will still get those modified .tsv files as long as you provide --psm.
@fcyu I am now executing a sh script as suggested by the Linux tutorial. This time with 4 *.d folders. It runs through, but I still don't get a multi dir or the 'MSstats.csv' file. I also checked the protein.tsv file. All intensity columns are filled with 0. Could it be that IonQuant has problems with file access rights?
Hi @tobiasko , can you send me your script and the log from IonQuant?
Thanks,
Fengchao
Hi @fcyu Does IonQuant write any log files? Can't see any.
It prints some info to console. If you were using FragPipe, FragPipe would save it to a file. If not, you may need to redirect the printed info to a file.
If you don't have it now, you can send me your command/shell script first.
Best,
Fengchao
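Because IonQuant only prints to the console, the shell has to capture the output. A minimal sketch of the redirection suggested above, keeping standard output and standard error in separate files; `run_logged` is a hypothetical wrapper and the log file names are placeholders.

```shell
#!/bin/sh
# Hypothetical wrapper: run any command and write stdout and stderr
# to separate files so both can be attached to a bug report.
# Usage: run_logged <stdout-file> <stderr-file> command [args...]
run_logged() {
    out_file=$1
    err_file=$2
    shift 2
    "$@" > "$out_file" 2> "$err_file"
}

# Example with the command used earlier in this thread:
#   run_logged ionquant.out.txt ionquant.err.txt \
#       java -Xmx32G -jar /usr/local/nesvilab/IonQuant-1.0.0.jar \
#       --multidir "multiExpRes" 2-21-2020_autoQC4L_444_1.d 2-21-2020_autoQC4L_444_1.pepXML
```

Keeping the two streams apart makes it easy to see at a glance whether anything was written to standard error.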
I repeated only the IonQuant execution and redirected std out and std error to a file.
tobiasko@Tobiass-MBP:~/Downloads > head ionquant.1.txt
IonQuant version IonQuant-1.0.0
Batmass-IO version 1.17.2
timsdata library version timsdata-2-4-4
(c) University of Michigan
System OS: Linux, Architecture: amd64
Java Info: 1.8.0_242, OpenJDK 64-Bit Server VM, Oracle Corporation
JVM started with 28 GB memory
2020-04-03 17:21:58 [INFO] - Parameters:
2020-04-03 17:21:58 [INFO] - threads = 128
2020-04-03 17:21:58 [INFO] - mztol = 10.0
ionquant.1.txt std error was empty.
The bash script for FragPipe functionality is (had to rename to .txt for upload):
@fcyu I just had the idea to review a log file that was written by FragPipe on Windows. Could it be that the unix script you published is...let's say far away from what FragPipe does (more like a skeleton)? At least I can see some striking difference in how data is organised into folders and how workspaces are handled. What is the logic we have to follow?
Hi @tobiasko ,
You need to provide --psm and the corresponding psm.tsv path. Without the --psm flag, IonQuant quantifies all PSMs in the pepXML and then stops, because it does not know where to find the tsv tables.
You may check the example I gave you again.
Best,
Fengchao
The logic is that each experiment has its own folder. That folder contains the pepXML files and the tsv tables from Philosopher. When running IonQuant, each --psm flag gives the path to one experiment's psm.tsv, so a multi-experiment run has multiple --psm flags. Finally, the --multidir flag indicates the parent folder of all experiments.
Best,
Fengchao
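The layout described above can be turned into a command line mechanically. The sketch below only prints a command, it does not run IonQuant, and it uses only the --psm and --multidir flags mentioned in this thread. Everything else is an assumption for illustration: that every sub-folder holding a psm.tsv is one experiment, that the spectral file for NAME.pepXML is NAME.d in a separate raw-data folder, the helper name `build_ionquant_cmd`, and the `ionquant.jar` path.

```shell
#!/bin/sh
# Sketch: print an IonQuant command for all experiment folders under a parent.
# Assumptions (not taken from IonQuant documentation): each sub-folder with a
# psm.tsv is one experiment, and <name>.pepXML belongs to <raw_dir>/<name>.d.
# Paths must not contain spaces, because the command is built as plain text.
build_ionquant_cmd() {
    parent=$1
    raw_dir=$2
    psm_args=""
    file_args=""
    for psm in "$parent"/*/psm.tsv; do
        [ -e "$psm" ] || continue             # glob matched nothing
        exp_dir=$(dirname "$psm")
        psm_args="$psm_args --psm $psm"       # one --psm flag per experiment
        for pep in "$exp_dir"/*.pepXML; do
            [ -e "$pep" ] || continue
            run=$(basename "$pep" .pepXML)
            # spectral file first, then its pepXML, as in the example command
            file_args="$file_args $raw_dir/$run.d $pep"
        done
    done
    echo "java -Xmx32g -jar ionquant.jar$psm_args --multidir $parent$file_args"
}

# Usage: build_ionquant_cmd . /path/to/raw
```

Printing the command instead of executing it allows a comparison with what a FragPipe run logs before anything is launched.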
Hi @fcyu,
Wait... do I understand you correctly? The LC-MS table in the FragPipe GUI is represented by a folder structure on the Linux command line? So given a table that looks like:
file1, expA, 1
file2, expB, 2
I would need to generate folders named expA_1 and expB_2 and place file1 and file2 in them, respectively? Do I have to use _ as a separator? Would multiple files placed in the same folder be treated as fractions or as technical replicates (repeated injections)?
Hi @tobiasko ,
Yes, FragPipe generates folders and puts files into the corresponding folders according to the LC-MS table.
You need to have the tsv tables and the pepXML files in their corresponding folders. The spectral files (e.g., mzML, mzXML, or .d) can be somewhere else, but you need to provide their paths to IonQuant.
Yes, you need to use _ as a separator.
Multiple spectral files with the same Experiment and Replicate are treated as fractions. You can find a detailed explanation here (https://github.com/Nesvilab/MSFragger/blob/master/tutorial_fragpipe.md#for-reports-with-results-from-different-fractionated-replicates-shown-in-separate-columns).
Best,
Fengchao
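A sketch of the folder convention confirmed above: one folder per experiment and replicate, named Experiment_Replicate. The three-column, comma-separated table (file, experiment, replicate) mirrors Tobi's example and is an assumption, as is the helper name `make_experiment_dirs`. The function only creates the folders and reports which file belongs where; placing the pepXML files and tsv tables is left out.

```shell
#!/bin/sh
# Sketch: create <experiment>_<replicate> folders from a 3-column table.
# Input lines look like "file1, expA, 1" (format assumed from the example above).
# Note: all spaces are stripped, so names containing spaces are not supported.
make_experiment_dirs() {
    table=$1
    out_root=$2
    while IFS=, read -r file exp rep; do
        file=$(printf '%s' "$file" | tr -d ' ')
        exp=$(printf '%s' "$exp" | tr -d ' ')
        rep=$(printf '%s' "$rep" | tr -d ' ')
        [ -n "$file" ] || continue            # skip empty lines
        mkdir -p "$out_root/${exp}_${rep}"
        echo "$file -> ${exp}_${rep}"
    done < "$table"
}

# Usage: make_experiment_dirs lcms_table.csv FragPipe_output
```

Files that share both experiment and replicate map to the same folder, which matches the statement that they are treated as fractions.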
Ok! And all philosopher commands need to be executed in every folder?
Yes, and you need to run Philosopher proteinprophet with all interact-*.pep.xml files to generate a combined interact-*.prot.xml. You also need to run Philosopher abacus to generate the combined tsv files.
I suggest borrowing the commands that FragPipe generates when you click Dry Run.
Best,
Fengchao
Tobi
This is why we developed FragPipe: to generate scripts for executing the commands. FragPipe runs on Linux too. As Fengchao said, you can use it to see which commands need to be executed, and then modify them.
Best Alexey
Hi @anesvi,
I think the dry run on Linux is a very nice idea. I did it yesterday for a closed search, and we are now modifying our bash script accordingly.
Thx for the tip, Tobi
Hi @fcyu,
I checked the listing for a dry run of a closed search and most is clear. What I still don't fully understand is the PeptideProphet section. Why are you using tmp DIRs here? Are they created by an upstream process or is there some kind of util function that isn't logged?
PeptideProphet: Workspace init [Work dir: /scratch/tobiasko/fragpipe_test/FragPipe_output/expA_1/fragpipe-3-2-2020_11-25-33_autoQC01_463_1_Slot1-54.pepXML-temp]
/usr/local/nesvilab/philosopher-3.2.3/philosopher workspace --init
PeptideProphet: Workspace init [Work dir: /scratch/tobiasko/fragpipe_test/FragPipe_output/expA_2/fragpipe-3-3-2020_09-19-55_autoQC01_470_1_Slot1-54.pepXML-temp]
/usr/local/nesvilab/philosopher-3.2.3/philosopher workspace --init
PeptideProphet: Workspace init [Work dir: /scratch/tobiasko/fragpipe_test/FragPipe_output/expB_3/fragpipe-3-3-2020_10-29-26_autoQC01_471_2_Slot1-54.pepXML-temp]
/usr/local/nesvilab/philosopher-3.2.3/philosopher workspace --init
PeptideProphet: Workspace init [Work dir: /scratch/tobiasko/fragpipe_test/FragPipe_output/expB_4/fragpipe-3-3-2020_11-37-59_autoQC01_472_3_Slot1-54.pepXML-temp]
/usr/local/nesvilab/philosopher-3.2.3/philosopher workspace --init
Greetings, Tobi
@tobiasko PeptideProphet is single-threaded. Separating things into different folders allows FragPipe to run multiple instances of PeptideProphet at once, which speeds up the process. This is the only reason.
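The pattern described here, one working directory per input so that several single-threaded processes can run side by side, looks roughly like this in a shell script. It is a generic sketch, not what FragPipe executes: `run_in_dirs` is a hypothetical helper, and the command passed to it stands in for the real PeptideProphet call, whose arguments are best copied from a FragPipe Dry Run.

```shell
#!/bin/sh
# Sketch: run the same command once per directory, all in the background,
# then wait for every job to finish. Each job works in its own directory,
# so the instances cannot overwrite each other's output.
# Usage: run_in_dirs "dir1 dir2 ..." command [args...]
run_in_dirs() {
    dirs=$1
    shift
    for d in $dirs; do                # word splitting: paths must not contain spaces
        mkdir -p "$d"
        ( cd "$d" && "$@" ) &         # subshell keeps the cd local to this job
    done
    wait                              # block until all background jobs are done
}
```

For real runs it may be worth capping the number of simultaneous jobs at the number of available cores, which this sketch does not do.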
Another question... sorry! I am not familiar with Philosopher, but the pipeline concept looks really attractive to me. Have you tried moving parts of a FragPipe-like linux workflow (a closed search) into a Philosopher pipeline? Pros and Cons versus a pure bash script?
Hi @tobiasko ,
No need to be sorry. I personally prefer shell scripts because I have full control over the commands and know exactly which ones will run. But the Philosopher pipeline may be easier to run and maintain. Unfortunately, I have never used it. Alexey @anesvi and Felipe @prvst are better placed than me to answer this question.
Best,
Fengchao
@tobiasko Running a full analysis via pipeline mode is a more robust solution than relying on bash scripts. It is also ideal for maintaining reproducibility, especially if you are aiming for a publication. You need only one configuration file for every program and every analysis step, and you don't need to bother with command syntax or with file names and paths. The pipeline can be triggered with one command; you can find an example here.
@prvst That sounds pretty convincing! I am working my way through the example. If I need additional modules not included in the example (like IonQuant) how would I add them?
The current version works with Comet, MSFragger, the Prophets and TMT-Integrator. Any other tool that you want to add will need to be executed manually.
IMPRESSIVE! I just executed a complete spectral counting workflow with PASEF data (4 files)...using a single command.