snijderlab / stitch

Template-based assembly of proteomics short reads for de novo antibody sequencing and repertoire profiling
MIT License

Implement spectrum loading for other input types #195

Closed. douweschulte closed this issue 1 year ago

douweschulte commented 2 years ago

Spectrum loading currently only works with Peaks data. To implement it for other input types (for now only Novor), the raw data files need to be specified differently. Each input that references raw data by scan number (so Peaks and Novor) should get an option to specify where the raw data can be found: a file for Novor, a directory for Peaks. For this, the global RawDataDirectory will have to be deprecated and moved into the separate input definitions. For now the global one can be used as a fallback for the local one, so that any batch files defined for 1.2.0 continue to work, although a deprecation warning will be shown.
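The fallback behaviour described above could look roughly like the sketch below. It is written in Python purely for illustration and is not Stitch's actual code: `InputDefinition`, `raw_data`, and `resolve_raw_data` are hypothetical names, and only `RawDataDirectory` comes from the existing batch file format.

```python
import warnings
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class InputDefinition:
    """One input block from a batch file (hypothetical shape, not Stitch's real model)."""
    name: str
    kind: str  # "peaks" or "novor"
    raw_data: Optional[Path] = None  # directory for Peaks, single file for Novor


def resolve_raw_data(inp: InputDefinition,
                     global_raw_dir: Optional[Path] = None) -> Optional[Path]:
    """Prefer the per-input location; fall back to the deprecated global one."""
    if inp.raw_data is not None:
        return inp.raw_data
    if global_raw_dir is not None:
        # Batch files written for 1.2.0 still work, but the user is nudged
        # towards the per-input option.
        warnings.warn(
            f"Input '{inp.name}': the global RawDataDirectory is deprecated; "
            "specify the raw data location on the input definition instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return global_raw_dir
    return None  # no raw data known, so spectrum loading is skipped for this input
```

With this shape, an input that sets its own location never touches the global setting, so mixing old and new style definitions in one batch file stays unambiguous.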

Possible programs to support:

douweschulte commented 1 year ago

To support other types of input data, take a look at HLXLToolChain in BMS-developers.

douweschulte commented 1 year ago

There is one major problem with MGF though: it does not store the fragmentation method, so it will be hard to provide accurate annotations from these files.
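To make the limitation concrete, here is a minimal MGF reader sketch in Python (illustrative only, not Stitch's implementation). The commonly used headers such as TITLE, PEPMASS, CHARGE, RTINSECONDS, and SCANS carry no fragmentation method, so the sketch assumes the caller has to supply one; the `fragmentation` parameter and the `read_mgf` name are hypothetical.

```python
from typing import Iterable, Iterator


def read_mgf(lines: Iterable[str], fragmentation: str = "unknown") -> Iterator[dict]:
    """Yield one dict per BEGIN IONS / END IONS block.

    The fragmentation method is not read from the file; it is whatever the
    caller passes in, because MGF has no common header for it.
    """
    spectrum = None
    for raw in lines:
        line = raw.strip()
        if line == "BEGIN IONS":
            spectrum = {"headers": {}, "peaks": [], "fragmentation": fragmentation}
        elif line == "END IONS":
            if spectrum is not None:
                yield spectrum
            spectrum = None
        elif spectrum is None or not line:
            continue  # ignore anything outside a spectrum block, and blank lines
        elif "=" in line and not line[0].isdigit():
            # Header line such as PEPMASS=500.25 or SCANS=17
            key, value = line.split("=", 1)
            spectrum["headers"][key.upper()] = value
        else:
            # Peak line: m/z and intensity, optionally followed by a charge
            parts = line.split()
            if len(parts) >= 2:
                spectrum["peaks"].append((float(parts[0]), float(parts[1])))
```

Any annotation built on such spectra is only as accurate as the fragmentation method the user supplies, which is the core of the problem described above.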

douweschulte commented 1 year ago

There is a minor problem with Novor: the scans that are retrieved (only when using MGF file input) do not seem to match the sequences. Very likely the scan numbers have some layer of indirection.