compomics / searchgui

Highly adaptable common interface for proteomics search and de novo engines
http://compomics.github.io/projects/searchgui.html

Adding spectrum file(s) does not recognize .d as valid #370

Closed tomthun closed 4 months ago

tomthun commented 4 months ago

When I select a .d folder under spectrum file(s), this message pops up:

[image]

How do I read in .d timsTOF raw files?

hbarsnes commented 4 months ago

We do not have much experience with timsTOF data ourselves, but if the files can be converted with ProteoWizard (https://proteowizard.sourceforge.io), we should be able to convert them in SearchGUI too, as we simply use msconvert from ProteoWizard for the conversion. Have you tried converting the files directly in ProteoWizard? Does that work?

Can you also confirm that the given folder actually contains one or more spectrum files? And if so, what is the file format of these files?

tomthun commented 4 months ago

The structure looks like this: [image]

You can also find other data here which doesn't work: data

I've also read that by referencing the location of the ProteoWizard installation, I can provide additional raw file types as input. However, even after installing proteoscape, the msconvert setting is grayed out and I cannot add the installation path.

Thanks for the fast reply!

hbarsnes commented 4 months ago

The msconvert settings will only be enabled if we detect a supported non-Thermo raw file in the provided spectrum files or folder. (For Thermo raw files we use ThermoRawFileParser instead.)

There were, however, a couple of raw file formats supported by msconvert that we had not added to our internal list of supported raw file extensions, including Bruker .tdf files, which seem to be the format of your raw data. This has now been corrected. (Note that there is also a .d file in your .d folder, but it is empty.)

Furthermore, it seems like msconvert requires the .d folder as input and not the .tdf file directly. Support for this has now also been implemented.
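
A minimal sketch of the detection rules described above, written in Python rather than taken from SearchGUI's own code. The extension set and both function names are hypothetical. Only what this thread states is modeled: a `.tdf` file marks Bruker raw data, and the enclosing `.d` folder is what gets handed to msconvert.

```python
from pathlib import Path
from typing import Optional

# Hypothetical list for illustration; SearchGUI's real internal list
# of supported raw file extensions is not shown in this thread.
MSCONVERT_RAW_EXTENSIONS = {".tdf"}


def is_bruker_d_folder(path: Path) -> bool:
    """Return True if path is a .d folder holding at least one .tdf file."""
    if not (path.is_dir() and path.suffix.lower() == ".d"):
        return False
    return any(
        child.is_file() and child.suffix.lower() in MSCONVERT_RAW_EXTENSIONS
        for child in path.iterdir()
    )


def msconvert_input_for(path: Path) -> Optional[Path]:
    """Map a user-selected path to the path msconvert should receive.

    A .tdf file is replaced by its parent .d folder, since msconvert
    seems to need the folder rather than the file itself.
    Returns None when the path is not recognized as Bruker raw data.
    """
    if is_bruker_d_folder(path):
        return path
    if (
        path.is_file()
        and path.suffix.lower() == ".tdf"
        and is_bruker_d_folder(path.parent)
    ):
        return path.parent
    return None
```

With this shape, selecting either the `.d` folder or the `.tdf` file inside it resolves to the same msconvert input, while a `.d` folder without a `.tdf` file is rejected.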

After these changes, the conversion from a .d folder to an mzML file seems to be working. It is, however, extremely slow, and I have not yet had the patience to let the conversion complete. This is also the case when using msconvert directly, hence I do not think there is much we can do about this on our end. Maybe there is something you can do with the msconvert filters to speed it up?

I will do some more testing and release a new version of SearchGUI (hopefully) later today after making sure that the resulting mzML file can actually be used by the search engines. In the meantime you can perhaps do some tests of converting your files directly in msconvert and see if you are able to increase the conversion speed somehow?
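
For testing the conversion outside SearchGUI, the msconvert call can be sketched as below. `--mzML`, `-o`, `--filter` and `--combineIonMobilitySpectra` are existing msconvert options, but the function names, the paths, and the assumption that msconvert is on the PATH are illustrative. Whether any of these settings actually shortens the conversion for a given dataset has not been verified here.

```python
import subprocess
from pathlib import Path
from typing import List, Sequence


def build_msconvert_command(
    d_folder: Path,
    output_dir: Path,
    msconvert: str = "msconvert",  # assumption: msconvert is on the PATH
    filters: Sequence[str] = (),
    combine_ion_mobility: bool = True,
) -> List[str]:
    """Build the argument list for converting a Bruker .d folder to mzML."""
    cmd = [msconvert, str(d_folder), "--mzML", "-o", str(output_dir)]
    if combine_ion_mobility:
        # Writes the ion mobility scans of a frame as one spectrum
        # instead of many individual spectra.
        cmd.append("--combineIonMobilitySpectra")
    for spec in filters:
        cmd += ["--filter", spec]
    return cmd


def convert(d_folder: Path, output_dir: Path, **options) -> None:
    """Run msconvert and raise CalledProcessError if it fails."""
    subprocess.run(
        build_msconvert_command(d_folder, output_dir, **options), check=True
    )
```

One setting to try would be `convert(Path("sample.d"), Path("out"), filters=["msLevel 2-"])`, which keeps only MS2 and higher spectra. Dropping MS1 spectra may not suit every downstream workflow, so treat it as an experiment rather than a recommendation.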

hbarsnes commented 4 months ago

SearchGUI v4.3.7 has now been deployed supporting the conversion of .d folders from Bruker instruments. Note that the conversion is still very slow though...

I will close the issue (given that the initial problem has been solved), but do feel free to add more comments, or open a new issue, if you find any solutions to the speed problem.

tomthun commented 4 months ago

Just one more question: Sage now natively works for timsDDA (but not timsDIA yet). Will the .d be converted by default, or only if I select the option for conversion?

hbarsnes commented 4 months ago

In SearchGUI we only support mgf and mzML (I did not know that Sage supported timsDDA directly), hence the spectrum files will always be converted if they are not in either of these two formats. Not all search engines support both formats either, so sometimes we have to convert even if mzML is provided.
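
The rule above can be sketched as a small decision function. This is an illustration in Python, not SearchGUI's implementation. The engine names and their format sets are hypothetical placeholders, since the thread does not say which engine accepts which format.

```python
from typing import Dict, Iterable, Set

# Formats SearchGUI accepts without conversion, per the comment above.
SEARCHGUI_FORMATS: Set[str] = {"mgf", "mzml"}

# Hypothetical per-engine support table, for illustration only.
ENGINE_FORMATS: Dict[str, Set[str]] = {
    "engine_a": {"mgf", "mzml"},
    "engine_b": {"mgf"},
}


def needs_conversion(file_format: str, engines: Iterable[str]) -> bool:
    """Return True if a spectrum file must be converted before searching."""
    fmt = file_format.lower().lstrip(".")
    if fmt not in SEARCHGUI_FORMATS:
        # Raw formats such as a Bruker .d folder are always converted.
        return True
    # Even mgf or mzML input is converted when a selected engine
    # does not read that particular format.
    return any(fmt not in ENGINE_FORMATS[engine] for engine in engines)
```

Under these placeholder tables, `needs_conversion("d", ["engine_a"])` and `needs_conversion("mzML", ["engine_a", "engine_b"])` both return `True`, while `needs_conversion("mzML", ["engine_a"])` returns `False`.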