Enable search engines summary

bigbio / pmultiqc

A library for QC report based on MultiQC framework

GNU General Public License v3.0


Enable search engines summary #66

Closed: WangHong007 closed this 2 years ago

WangHong007 commented 2 years ago

Fixes #62. New example is here: report

jpfeuffer commented 2 years ago

Did you see my comment in the issue about the consensus plot?

ypriverol commented 2 years ago

@WangHong007 can you upload an example for a spectral-counting quantms analysis? I would like to see whether the changes proposed by @daichengxin also show up in the new identification plots.

ypriverol commented 2 years ago

> Did you see my comment in the issue about the consensus plot?

@WangHong007, @jpfeuffer is talking about this comment https://github.com/bigbio/pmultiqc/issues/62#issuecomment-1195498670

WangHong007 commented 2 years ago

@jpfeuffer You mean counting how many search engines identify each PSM? In that case, the number of bars per file equals the number of search engines (1, 2, or in future 3, 4, ...), and a single search engine will also be plotted. Also, this part does not need the idXMLs produced after consensus; it needs to process the idXMLs from the individual engines.
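A minimal sketch of the counting described above, assuming the PSMs of each engine have already been extracted into sets of `(spectrum, peptide)` keys. The engine names, the key layout, and both function names are hypothetical and are not taken from pmultiqc.

```python
from collections import Counter

def engines_per_psm(psms_by_engine):
    """Return, for every PSM key, how many engines reported it."""
    support = Counter()
    for psms in psms_by_engine.values():
        # set() guards against the same PSM being listed twice by one engine
        for psm in set(psms):
            support[psm] += 1
    return support

def support_distribution(support):
    """Return how many PSMs were reported by exactly 1, 2, ... engines."""
    return Counter(support.values())

# Made-up example data: two engines, three distinct PSMs.
psms_by_engine = {
    "engine_a": {("scan=1", "PEPTIDEK"), ("scan=2", "AAAK")},
    "engine_b": {("scan=1", "PEPTIDEK"), ("scan=3", "CCCK")},
}

support = engines_per_psm(psms_by_engine)
distribution = support_distribution(support)
```

With this layout, the bars of the plot would be the entries of `distribution`, one bar per possible number of agreeing engines.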

WangHong007 commented 2 years ago

@ypriverol Got it.

jpfeuffer commented 2 years ago

As mentioned before, the "consensus_support" meta value should show you that.

jpfeuffer commented 2 years ago

https://github.com/OpenMS/OpenMS/blob/develop/src/tests/topp/ConsensusID_4_output.idXML#L19

WangHong007 commented 2 years ago

Just counting the frequency of different consensus_supports, right?

jpfeuffer commented 2 years ago

Yes, maybe multiply by the number of search engines, because the support is more like a ratio. E.g. 0.5 means it was found by half of all search engines.
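A rough sketch of counting the frequency of `consensus_support` values, assuming the value is stored as a `UserParam` child of each `PeptideHit`, as in the idXML test file linked above. The embedded XML is a made-up, heavily trimmed fragment, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Made-up, trimmed idXML-like fragment; real files carry many more
# elements and attributes.
SAMPLE_IDXML = """<IdXML>
  <IdentificationRun>
    <PeptideIdentification>
      <PeptideHit sequence="PEPTIDEK">
        <UserParam type="float" name="consensus_support" value="1"/>
      </PeptideHit>
    </PeptideIdentification>
    <PeptideIdentification>
      <PeptideHit sequence="AAAK">
        <UserParam type="float" name="consensus_support" value="0.5"/>
      </PeptideHit>
    </PeptideIdentification>
    <PeptideIdentification>
      <PeptideHit sequence="CCCK">
        <UserParam type="float" name="consensus_support" value="1"/>
      </PeptideHit>
    </PeptideIdentification>
  </IdentificationRun>
</IdXML>"""

def count_consensus_support(xml_text):
    """Count how often each consensus_support value occurs among peptide hits."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for hit in root.iter("PeptideHit"):
        for param in hit.findall("UserParam"):
            if param.get("name") == "consensus_support":
                counts[float(param.get("value"))] += 1
    return counts

support_counts = count_consensus_support(SAMPLE_IDXML)
```

The keys of `support_counts` are the raw support values; whether they should be rescaled by the number of engines is left open here.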

WangHong007 commented 2 years ago

New example is here: report

jpfeuffer commented 2 years ago

Great! I think I made a mistake :D Looks like support * nr_engines does not work. Maybe call it "Consensus across search engines" and just put support there. I can then explain later in the help text what it means.

Otherwise the PR is good to go. Maybe in a later PR we can find a way to limit the histograms to the actual maximum of the data, but you would probably need to parse all files of a search engine first, then create the empty histogram, and then fill it.
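The two-pass idea could look roughly like this. It is only a sketch under assumptions: the values are taken to be non-negative and already parsed into one list per file, and the function name, file names, and bin count are hypothetical.

```python
def build_histograms(values_per_file, n_bins=4):
    """Bin every file's values on shared edges limited to the global maximum."""
    # Pass 1: look at all files of the engine to find the actual maximum.
    all_values = [v for values in values_per_file.values() for v in values]
    if not all_values:
        return [], {}
    upper = max(all_values)
    width = upper / n_bins if upper > 0 else 1.0
    edges = [i * width for i in range(n_bins + 1)]

    # Pass 2: create an empty histogram per file, then fill it.
    histograms = {}
    for name, values in values_per_file.items():
        bins = [0] * n_bins
        for v in values:
            # The maximum itself falls into the last bin.
            idx = min(int(v / width), n_bins - 1)
            bins[idx] += 1
        histograms[name] = bins
    return edges, histograms

# Made-up scores for two files of one search engine.
edges, histograms = build_histograms(
    {"file_a": [0, 10, 20], "file_b": [5, 40]}, n_bins=4
)
```

Because the edges are computed once from all files, the per-file histograms stay directly comparable and no bins extend beyond the largest observed value.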

WangHong007 commented 2 years ago

Final report here: report