info about main .csv report file

pFindStudio / pLink2

pLink is software dedicated to the analysis of chemically cross-linked proteins or protein complexes using mass spectrometry.

Dear developers, I would like to know how to read the main .csv file in the 'reports' folder. This file appears to contain information about all spectra analyzed in the entire file, but it is not clear which of them are then used for the cross-link, loop-link and mono-link .csv files. Can you explain what these columns represent?

Target_decoy: I see three possible values (0, 1 and 2). The only pattern I notice is that rows with value 2 also carry information about the proteins involved, even though every row has a candidate peptide/cross-link identified.
q-value: I assume this value is used for FDR control, but I am not sure.
Protein_Type, isComplexSatisfied and isFilterIn

Furthermore, this file has three score columns (Refined_Score | SVM_Score | Score), while the other .csv files have only the last one. What do the other columns really represent?

Dear @RicZen,

Please see this wiki about the .csv file format: https://github.com/pFindStudio/pLink2/wiki/CSV-result.

One more thing: the .filtered_.csv files contain only the results that satisfy the FDR threshold (e.g. <5% FDR), while the first .csv file (the one without "filtered" in its name, which you call the main .csv file) contains all spectra in the entire file, including those that do not satisfy the FDR threshold. Most users only need to care about the filtered .csv files.

The first .csv file contains some extra information, such as Target_decoy, q-value, etc. Please see the wiki link above for their explanation. Columns like isComplexSatisfied and isFilterIn are for software debugging purposes; there is no need to understand them.
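
For anyone who still wants to look inside the unfiltered .csv file, here is a minimal Python sketch that counts rows per Target_decoy value and collects rows below a q-value cutoff. It rests on several assumptions that are not confirmed in this thread: that the file is a plain comma-separated table with a single header row, that the q-value column holds plain decimal numbers, and that a cutoff of 0.05 corresponds to the "<5% FDR" example. The function name `summarize_report`, the `Title` column and the sample rows are made up for illustration. This is not pLink's own filtering logic and is not guaranteed to reproduce the filtered .csv files.

```python
import csv
import io
from collections import Counter


def summarize_report(handle, q_threshold=0.05):
    """Count rows per Target_decoy value and collect rows with q-value below the cutoff.

    Assumes a flat CSV with a header row containing 'Target_decoy' and 'q-value'.
    """
    reader = csv.DictReader(handle)
    counts = Counter()
    passing = []
    for row in reader:
        counts[row.get("Target_decoy", "")] += 1
        try:
            q = float(row.get("q-value", ""))
        except (TypeError, ValueError):
            continue  # no numeric q-value on this row, so it cannot pass the cutoff
        if q < q_threshold:
            passing.append(row)
    return counts, passing


# Made-up sample data standing in for a real report file (hypothetical values).
sample = io.StringIO(
    "Title,Target_decoy,q-value,Score\n"
    "spec1,2,0.01,0.00001\n"
    "spec2,0,0.20,0.3\n"
    "spec3,1,0.03,0.002\n"
    "spec4,2,,0.5\n"
)
counts, passing = summarize_report(sample)
print(dict(counts))
print([row["Title"] for row in passing])
```

To run it on a real report, replace `sample` with a file handle, for example `open(path, newline="")`, where `path` points to the unfiltered .csv in the 'reports' folder. For the actual meaning of each Target_decoy value, refer to the wiki page linked above.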

info about main .csv report file #61