kevinkovalchik / RawTools

RawTools is an open-source and freely available package designed to perform scan data parsing and quantification, and quality control analysis of Thermo Orbitrap raw mass spectrometer files from data-dependent acquisition experiments.
Apache License 2.0

1.4.0-beta: exported files are incorrectly placed in inner path #12

Closed microdou closed 5 years ago

microdou commented 5 years ago

Describe the bug: In 1.4.0-beta, exported files are incorrectly placed in a nested inner path.

Description of raw file: MS2

Command line arguments: mono tools/RawTools/RawTools.exe parse -f data/example.raw -p

Command line output: Files are generated as expected, but are incorrectly placed in a nested inner path (data/data/example._parse.txt). As you can see, the directory from the file argument is used twice. This bug was introduced in 1.4.0-beta.

Desktop (please complete the following information):

kevinkovalchik commented 5 years ago

Thanks for bringing this up! I'll have a look at it. In the meantime, the CLI works fine with absolute paths, so you can use those if the nested folders are too bothersome.

Kevin
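The absolute-path workaround mentioned above can be scripted. The following is a minimal sketch that reuses the command line from the report; the helper name `build_parse_command` and its default `exe` location are hypothetical and only serve the illustration.

```python
import os


def build_parse_command(raw_path, exe="tools/RawTools/RawTools.exe"):
    """Return the argument list for a RawTools parse run with an absolute input path."""
    # Resolving against the current working directory means the -f argument no
    # longer carries the relative directory that the report shows being repeated.
    absolute_raw = os.path.abspath(raw_path)
    return ["mono", exe, "parse", "-f", absolute_raw, "-p"]


if __name__ == "__main__":
    # Print the command so it can be inspected or pasted into a shell.
    print(" ".join(build_parse_command("data/example.raw")))
```

The returned list can be passed to `subprocess.run` as-is if you prefer to launch RawTools from the script.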

On Thu, Dec 6, 2018, 8:48 AM Junjie Wang <notifications@github.com> wrote:

Desktop (please complete the following information):

  • OS: Arch Linux

kevinkovalchik commented 5 years ago

Okay, I think it has been fixed now. Can you confirm the attached application works as you would expect? I'll add this to the beta release if it looks okay to you.

RawTools_PathsFixed_20181210.zip

Kevin

microdou commented 5 years ago

@kevinkovalchik I tested RawTools-1.4.0-beta.4.zip, and the export path has been fixed! Thanks!

kevinkovalchik commented 5 years ago

Great! Thanks for letting me know.

Kevin

microdou commented 5 years ago

Hi @kevinkovalchik, a different question: do you know how to export mass lists with the charge of each peak for every MS1 scan in a raw file?

I'm interested in getting the original charge tags from the MS1 scans in the raw file (if there are any), but apparently none of the mass lists exported by third-party software contain charge info (mgf files, for example). An alternative is to use Qual Browser or Freestyle, but that is a painful manual procedure even for a single scan.

kevinkovalchik commented 5 years ago

I don't know if there are existing tools which will do it, but I could write one pretty quick. Being able to dump out the raw scan data would be a nice feature to add to RawTools anyway. What format would you find easiest to work with? Something like MGF, but with a third column for charge state? There is some other data available for FTMS scans like noise, baseline and resolution if those are also of interest.
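To make the proposed format concrete, here is a rough sketch of an MGF-like block with a third column for charge state. The exact layout (the `SCANS=` header line, the number of decimals, and writing 0 for an undetermined charge) is an assumption for illustration, not the format RawTools ended up using.

```python
def format_scan_block(scan_number, peaks):
    """Format one scan as an MGF-like block with m/z, intensity and charge columns.

    `peaks` is an iterable of (mz, intensity, charge) tuples. A charge of 0 is
    used here to mean "not determined"; that convention is an assumption.
    """
    lines = ["BEGIN IONS", f"SCANS={scan_number}"]
    for mz, intensity, charge in peaks:
        # One peak per line: m/z, intensity, then the extra charge column.
        lines.append(f"{mz:.5f} {intensity:.1f} {charge}")
    lines.append("END IONS")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_scan_block(1, [(445.12003, 1000.0, 1), (622.02899, 250.5, 0)]))
```

Further columns for noise, baseline and resolution could be appended in the same way if those turn out to be useful.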

Kevin

kevinkovalchik commented 5 years ago

Or are you interested in exporting it for single scans?

microdou commented 5 years ago

I'm interested in all scans :) The richer the info the better, so I can choose what to use for evaluation.

I actually just got what I wanted using the old MSFileReader and a Python binding (https://github.com/frallain/MSFileReader-Python-bindings).

But I suppose the new raw reader should be more reliable and future-proof.

Either MGF, tabular, or plain text is good for me.

I'm actually curious about what kind of info is recorded in each scan. Do you have any documentation for that? I already got a glimpse of the richness of the info from the Python binding of the old reader.

Thank you!

microdou commented 5 years ago

Just whatever the raw file recorded. No need for post processing.

kevinkovalchik commented 5 years ago

Okay. I'll have a look at this sometime soon and get it into a release.

Yeah, those bindings for MSFileReader are nice. That is what we used in the initial release of RawQuant.

As far as what information is stored in the raw file, there is a lot of it. You might want to check out using RawFileReader in Python to explore it all. I like RawFileReader in Python better than MSFileReader because it is easier to directly access the library instead of using bindings. The MSFileReader bindings might be comparable... I'm not sure if they offer the same coverage of what is in the file.

If you want to try using RawFileReader from Python, you will need Python for .NET. I believe you can install it through pip, i.e. pip install pythonnet. It does work on Linux, but getting it installed can sometimes be difficult; at least it was on my system (CentOS 7). The dependencies weren't very well documented, so a lot of digging through error messages was required. On Windows it should install with no problems.
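Since the install step is where things can go wrong, a quick check that pythonnet is visible to the interpreter may save some confusion later. This is only a sketch using the standard library; the function name `pythonnet_available` is made up for the example.

```python
import importlib.util


def pythonnet_available():
    """Return True if the `clr` module provided by pythonnet can be located."""
    # find_spec only looks the module up; it does not import it or start a runtime.
    return importlib.util.find_spec("clr") is not None


if __name__ == "__main__":
    if pythonnet_available():
        print("pythonnet found")
    else:
        print("pythonnet not found; try: pip install pythonnet")
```

Finding the module does not guarantee that the underlying .NET or Mono runtime is set up correctly, so an actual `import clr` is still the real test.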

After that is done, use the following in a script or console to load RawFileReader into a python session (the clr reference is part of pythonnet):

import clr
import sys
clr.AddReference('ThermoFisher.CommonCore.Data')
# The reference above should point to the appropriate RawFileReader DLL file.
# You can also add references to the other RawFileReader libraries if you want to use features from those, like background subtraction.

from ThermoFisher.CommonCore.Data import Business
import ThermoFisher.CommonCore.Data

Then, to load a file, use:

rawFile = Business.RawFileReaderFactory.ReadFile("[path to raw file]")
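The loading steps above can be wrapped into one function. The sketch below uses only the calls already shown in this comment; the function name `open_raw_file`, the `dll_dir` parameter, and the idea of putting the DLL folder on `sys.path` so pythonnet can find the assembly are assumptions on my part, so adjust as needed for your setup.

```python
import os
import sys


def open_raw_file(dll_dir, raw_path):
    """Load RawFileReader through pythonnet and open one raw file."""
    # Fail early with a clear message before touching pythonnet at all.
    if not os.path.isdir(dll_dir):
        raise FileNotFoundError(f"RawFileReader DLL folder not found: {dll_dir}")
    if not os.path.isfile(raw_path):
        raise FileNotFoundError(f"Raw file not found: {raw_path}")

    # Assumption: pythonnet looks for assemblies on sys.path, so the folder
    # holding ThermoFisher.CommonCore.Data.dll is added there.
    if dll_dir not in sys.path:
        sys.path.append(dll_dir)

    # Imported lazily so the path checks above work even without pythonnet.
    import clr
    clr.AddReference("ThermoFisher.CommonCore.Data")
    from ThermoFisher.CommonCore.Data import Business

    return Business.RawFileReaderFactory.ReadFile(raw_path)
```

Checking the paths first keeps a wrong folder from surfacing as a less readable .NET loading error.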

The following archive contains a help file I compiled for RawFileReader (using Sandcastle). It is a compiled HTML help file, and should help you get started.

RawFileReaderHelp.zip

You can also check out this file to see how we implemented RawFileReader in RawQuant, which was written in Python. Note that there is a lot of data in the trailer extras, more than we pulled out of it, so that is worth exploring.
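For exploring the trailer extras, something along these lines may be a starting point. It assumes that `GetTrailerExtraInformation(scan_number)` returns an object with parallel `Labels` and `Values` collections; please confirm that against the help file. The helper names are hypothetical.

```python
def trailer_to_dict(labels, values):
    """Pair trailer labels with values, dropping the trailing colon on labels."""
    return {
        str(label).strip().rstrip(":").strip(): str(value).strip()
        for label, value in zip(labels, values)
    }


def trailer_extras(raw_file, scan_number):
    """Read the trailer extra fields for one scan from an opened raw file."""
    # Assumption: the returned object exposes Labels and Values of equal length.
    info = raw_file.GetTrailerExtraInformation(scan_number)
    return trailer_to_dict(info.Labels, info.Values)
```

Printing the resulting dictionary for a single MS1 scan is a quick way to see which fields your instrument actually records.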

Everything in RawFileReader should be available so long as you have imported the appropriate libraries. The only exception will be GetInstrumentMethod. It requires a Windows COM library to read the method out of the raw file, so it won't work on Linux.
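If the same script has to run on both Windows and Linux, the GetInstrumentMethod call can be guarded by a platform check. This is a sketch only; the wrapper name and the `index` argument (which method to read) are assumptions.

```python
import sys


def read_instrument_method(raw_file, index=0):
    """Return the instrument method text on Windows, or None elsewhere."""
    if not sys.platform.startswith("win"):
        # The COM library needed to read the method is only available on Windows.
        return None
    # Assumption: GetInstrumentMethod takes the index of the method to read.
    return raw_file.GetInstrumentMethod(index)
```

Returning None rather than raising lets the rest of an export continue without the method text on Linux.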

I don't remember if objects are automatically converted to Python types when you access them, or if you need to explicitly do it. I do remember that arrays are returned as .NET array objects, and converting them iteratively is super slow. Check out this file from RawQuant if you want to get out lots of arrays (e.g. mass lists): https://github.com/kevinkovalchik/RawQuant/blob/master/RawQuant/RawFileReader/converter.py. It contains a function asNumpyArray which directly casts a .NET array to a numpy array.

Hope that all turns out to be useful!

Kevin

microdou commented 5 years ago

@kevinkovalchik Wow, thank you! The feature is not a pressing need for me now since my Python scripts already sort of work. But it may very well be part of your future development if you envision something interesting. Thanks again for those tips; they are really helpful for getting started.

microdou commented 5 years ago

@kevinkovalchik By the way, one of the reasons I love the new RawFileReader is that it supports multiple platforms, including Linux.