smith-chem-wisc / FlashLFQ

Ultra-fast label-free quantification algorithm for mass-spectrometry proteomics
GNU Lesser General Public License v3.0

multiple ID files #53

Open Dmorgen opened 6 years ago

Dmorgen commented 6 years ago

Hi,

I noted that you write that I can use multiple raw files, but not multiple tsv files. Is there a way to run LFQ with multiple identification files? If so, does the program normalize between files?

Thanks! David.

trishorts commented 6 years ago

If you search multiple raw files together, the results will be in a single tsv file. FlashLFQ will provide an intensity value for each feature in the run where it was identified. If you use the match-between-runs (MBR) feature, then FlashLFQ will look in every file for a peak identified in at least one of the raw files.

If you have separate tsv files, then simply combine them offline in Excel or Notepad++ before running.

Normalization for FlashLFQ is completed but not released. It will be released shortly. I don't know if Rob would give you a copy of the unreleased version or not. Rob?

Dmorgen commented 6 years ago

Thanks!

rmillikin commented 6 years ago

FlashLFQ does not support multiple identification files, but as @trishorts just mentioned, you can certainly have identifications from multiple spectra files contained in one identification file. The "File Name" column identifies which spectra file the PSM is associated with. So, you will need to append the results of each identification file into one big file.
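
The append step described above can be sketched in a few lines of Python. This is only an illustration and not part of FlashLFQ: the function name `combine_psm_files` is hypothetical, and the sketch assumes every identification file is tab-separated and starts with an identical header row (including the "File Name" column).

```python
from typing import Iterable


def combine_psm_files(inputs: Iterable[str], output: str) -> int:
    """Append tab-separated identification files that share one header.

    Hypothetical helper, not part of FlashLFQ. Returns the number of
    data rows written to `output`.
    """
    header = None
    rows_written = 0
    with open(output, "w", encoding="utf-8", newline="") as out:
        for path in inputs:
            with open(path, "r", encoding="utf-8", newline="") as src:
                first_line = src.readline().rstrip("\r\n")
                if header is None:
                    # Keep the header from the first file only.
                    header = first_line
                    out.write(header + "\n")
                elif first_line != header:
                    # Assumption: mismatched headers mean the files should
                    # not be merged blindly, so stop instead of guessing.
                    raise ValueError(f"Header of {path} differs from the first file")
                for line in src:
                    line = line.rstrip("\r\n")
                    if line:  # skip blank lines, e.g. a trailing newline
                        out.write(line + "\n")
                        rows_written += 1
    return rows_written
```

Calling `combine_psm_files(["a.tsv", "b.tsv"], "combined.tsv")` (placeholder file names) would write one header followed by the rows of both files. The header check is a deliberate choice: if the files were exported with different columns, it seems safer to fail than to produce a misaligned table.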

Normalization is in progress and will be released soon. The code to do all the normalization calculations is written and working, but it depends on the user specifying the condition/biorep/techrep/fraction associated with each spectra file, and this part is not written yet. So, the hard part is done, but there is no way to use it yet. Stay tuned :)

rmillikin commented 4 years ago

Just FYI, the GUI can take in multiple ID files. The command-line version cannot yet, but I will add this to a release soon.

Dmorgen commented 4 years ago

That's awesome, thanks!

lucssantos commented 4 years ago

Hello, everyone!

I have been running PeptideShaker outputs through FlashLFQ. However, I have performed four different analyses, each one with 5 biological replicates of one condition, against the same protein database. Now I would like to analyze these four PSM files together with FlashLFQ, mainly because of the Bayesian FC analysis.

When combining these files, as mentioned by @trishorts, do I need to worry about duplicated protein accessions? Most of the proteins are shared among the four files.

P.S.: Currently, I'm using FlashLFQ through the Galaxy server, so I can't run a single analysis with different ID files as in the GUI.

trishorts commented 4 years ago

Hello Lucas, I'll be sure @rmillikin gets this message. I'm sure he'll respond soon.

rmillikin commented 4 years ago

Hi @lucssantos ,

When combining PSM files, FlashLFQ will interpret any protein with the same accession as the same protein. For example, if you have Accession1 in one file and Accession1 in another file, they're both treated as the same protein; a duplicate will not be created. Hopefully that's what you mean? If not, can you clarify?
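
To see which accessions are shared before combining, something like the following could be used. It is a rough sketch, not FlashLFQ code: the function name `accessions_by_file` is hypothetical, the column name "Protein Accession" is an assumption about the input header and is passed as a parameter, and each accession field is compared as a whole string.

```python
import csv
from collections import defaultdict
from typing import Dict, Iterable, Set


def accessions_by_file(
    paths: Iterable[str],
    accession_column: str = "Protein Accession",  # assumed column name
) -> Dict[str, Set[str]]:
    """Map each accession string to the set of files that contain it.

    Hypothetical helper for inspecting inputs; it does not reproduce
    how FlashLFQ itself groups proteins.
    """
    seen: Dict[str, Set[str]] = defaultdict(set)
    for path in paths:
        with open(path, "r", encoding="utf-8", newline="") as handle:
            reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
            for row in reader:
                seen[row[accession_column]].add(path)
    return dict(seen)
```

An accession that maps to more than one file is one that appears in several inputs and, per the explanation above, would be treated as a single protein after the files are combined.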

lucssantos commented 4 years ago

Hello @rmillikin ,

That's exactly what I meant. Thanks for the support!

lucssantos commented 4 years ago

@rmillikin , me again!

When running the concatenated PSM file, I got the following error in the log file after the match-between-runs analysis started:

* Assertion at mini-exceptions.c:2621, condition gaddr == tls->stack_ovf_guard_base’ not

When I run the files separately, I do not have this kind of issue. Would you know anything about it?

After I deactivated the match-between-runs option, the analysis proceeded normally.

rmillikin commented 4 years ago

That error message looks like something from Galaxy. My guess is that FlashLFQ crashed during your analysis and Galaxy reported its own error message. I can ask the Galaxy team what that error message means.

When FlashLFQ crashes, it should generate an error report text file. Does Galaxy give you the ability to view that file? If so, can you post it here?