Closed: ypriverol closed this issue 1 year ago
Thanks for the suggestion!
It would indeed be a nice addition to propagate spectrum identifiers to the output format. I think it shouldn't be too difficult; I will have a look.
Hi Yasset,
I have started working on this. Would it be fine for you if we just return the entire title, e.g. id=mzspec:PXD001924:20140106_52_mlplus_tm3:index:10371,sequence=KWDLGDIVAAR/2?
Implementation-wise, this is a bit easier.
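Since the whole title is returned, the USI has to be pulled out of it downstream. Here is a minimal sketch of how that could look, assuming the title is a comma-separated list of key=value pairs exactly as in the example above. The helper names parse_title and split_usi are hypothetical and not part of MaraCluster, and the USI splitting assumes the run name itself contains no colons.

```python
def parse_title(title: str) -> dict:
    """Split a title such as 'id=...,sequence=...' into a dict of fields.

    Assumption: fields are separated by commas and each field is key=value.
    """
    fields = {}
    for part in title.split(","):
        key, sep, value = part.partition("=")
        if sep:  # ignore fragments that have no '='
            fields[key.strip()] = value.strip()
    return fields


def split_usi(usi: str) -> dict:
    """Break a USI of the form mzspec:collection:run:indexType:index into parts.

    Assumption: the run name contains no ':' characters.
    """
    parts = usi.split(":")
    if len(parts) < 5 or parts[0] != "mzspec":
        raise ValueError(f"not a recognisable USI: {usi!r}")
    return {
        "collection": parts[1],
        "run": parts[2],
        "index_type": parts[3],
        "index": parts[4],
    }


title = "id=mzspec:PXD001924:20140106_52_mlplus_tm3:index:10371,sequence=KWDLGDIVAAR/2"
fields = parse_title(title)
usi_parts = split_usi(fields["id"])
print(fields["id"])
print(usi_parts["collection"], usi_parts["run"], usi_parts["index"])
```

Keeping the title intact in the output and parsing it afterwards means any extra fields in the title (such as the sequence) are still available to the consumer.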
Go for it.
I added a command line argument --addSpecIds, which now adds the spectrum id/title as a fourth column to the clustering output. I will create a new release once all the builds have passed.
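For anyone consuming the new column, here is a rough sketch of grouping spectrum ids by cluster. It assumes the clustering output is tab-separated, that the third column holds the cluster index, and that blank lines may appear between clusters; only the fourth column being the spectrum id/title comes from the description above. The function name group_spec_ids, the file names, and the second and third sample ids are made up for illustration.

```python
import csv
from collections import defaultdict


def group_spec_ids(lines):
    """Map cluster index -> list of spectrum ids/titles (fourth column).

    Assumptions: tab-separated rows, cluster index in the third column,
    blank or short rows are skipped.
    """
    clusters = defaultdict(list)
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 4:
            continue  # blank separator line or row without a spectrum id
        clusters[row[2]].append(row[3])
    return dict(clusters)


# Hypothetical sample rows; real output would be read from the clustering file.
sample = [
    "run_a.mgf\t10371\t1\tid=mzspec:PXD001924:20140106_52_mlplus_tm3:index:10371,sequence=KWDLGDIVAAR/2\n",
    "run_b.mgf\t42\t1\tid=mzspec:PXD001924:20140106_52_mlplus_tm3:index:10372\n",
    "\n",
    "run_a.mgf\t7\t2\tid=mzspec:PXD001924:20140106_52_mlplus_tm3:index:7\n",
]
grouped = group_spec_ids(sample)
for cluster_id, spec_ids in grouped.items():
    print(cluster_id, len(spec_ids))
```

To use it on a real file, pass an open file handle instead of the sample list, since csv.reader accepts any iterable of lines.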
@MatthewThe @percolator let me know when the release is done.
Sorry for the delay, I have now released version 1.04, which includes the new feature.
Thanks, I will give it a try!
@percolator @MatthewThe
I'm trying to use MaraCluster to cluster billions of spectra. One problem I found is that we have multiple files, and we would like to use the USI (https://www.nature.com/articles/s41592-021-01184-6) as the identifier of each spectrum in the MGF, and then get the report back from MaraCluster with the USI instead of the index. This is what a USI looks like in an MGF:
The id for this spectrum will be mzspec:PXD001924:20140106_52_mlplus_tm3:index:10371. Do you think you can support this in MaraCluster?