ethanbass / chromConverter

Parsers for chromatography data in R (HPLC-DAD/UV, GC-FID, MS)
https://ethanbass.github.io/chromConverter/
GNU General Public License v3.0

Support for Shimadzu LCD mass spectrometry streams #11

Open silasmellor opened 1 year ago

silasmellor commented 1 year ago

Hi, is it possible to add a parser for Shimadzu .lcd files? I have attached an example file for reference. Best, Silas

Anthocyanin_2_MeOH001.zip

ethanbass commented 1 year ago

I will try to look into it, though I can't really make any promises or give you a timeline. It's a little bit tricky since the format is not documented at all, as far as I know. It would be helpful if you could provide a little more information about your file (e.g. what kind of instrument/software it was produced with and what the detectors are on the machine). Thank you for sharing the file! Ethan

silasmellor commented 1 year ago

Hey Ethan, that is all I can ask! The file comes from Shimadzu LabSolutions v 5.71. The instrument is a Shimadzu LC-20 with PDA, iirc, but I will double-check and let you know.

ethanbass commented 1 year ago

Thanks. Also, you may already know this (and I know it's not as convenient as reading the .lcd files directly), but I believe you can export the raw PDA data from LabSolutions by right-clicking the sample name and selecting File Conversion:Convert to ASCII. chromConverter already has a parser for these ASCII .txt files.

Also, if you can generate the ASCII file for the .lcd file you sent, it would be helpful to have as a point of comparison.

silasmellor commented 1 year ago

Hey Ethan, i was actually not aware of that, that sounds excellent! I will try it and upload the ascii file for you. Thanks again, Silas

silasmellor commented 1 year ago

Hey Ethan, here is the corresponding ASCII from LabSolutions. Anthocyanin_2_MeOH001.txt

ethanbass commented 1 year ago

Thanks for providing this file! I don't know if you have tried to read it with chromConverter yet, but it actually throws an error because it uses commas as the decimal separator (European style) rather than periods (American style). I will put up a patch shortly that can read the European-style files.
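For readers unfamiliar with the decimal-separator problem, here is a minimal base R sketch of the failure mode described above. It is an illustration only, not the chromConverter patch itself: the two tab-separated rows are made-up values, and `lines`, `res`, and `ok` are hypothetical names.

```r
# Two made-up data rows in the style of a comma-decimal ASCII export
lines <- c("0,00000\t12,5", "0,00667\t13,25")

# With the default dec = ".", forcing numeric columns makes the read fail;
# tryCatch() captures the error message instead of stopping the script
res <- tryCatch(
  read.table(text = lines, sep = "\t", colClasses = "numeric"),
  error = function(e) conditionMessage(e)
)

# Declaring "," as the decimal mark lets the same rows parse as numbers
ok <- read.table(text = lines, sep = "\t", dec = ",", colClasses = "numeric")
```

Here `res` ends up holding an error message rather than a data frame, while `ok` contains two numeric columns. Whether the package fix works the same way internally is not stated in this thread.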

ethanbass commented 1 year ago

This should be fixed by 725ddff. You should now be able to read in these ascii files at least:

x <- read_chroms(path_to_file_or_files, format_in = "shimadzu_dad")

silasmellor commented 1 year ago

Hey Ethan, yep, I did try it with the ASCII file and got an error, but didn't have time to figure out what it was about. Well, you fixed it before I even got that far, so thanks once again!

silasmellor commented 1 year ago

edit:fixed the format_in parameter, still get an error... edit:tried updating from CRAN and github, same thing

Warning in read_chroms(paths = "Anthocyanin_2_MeOH001.txt", format_in = "shimadzu_dad") :
  Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
  scan() expected 'a real', got '0,00000'

I feel i should apologize on behalf of european PCs, for some reason comma is default decimal here even though it means we have to jump through all sorts of hoops with our data files...

ethanbass commented 1 year ago

This is the old error -- it should be fixed in the latest version. Did you refresh your R session after reinstalling the package? Usually the old package is still cached if you install a new version of a package and don't reset your R session. The CRAN version will not be updated for a while, so that one will not work.

I'd suggest:

  1. Install the new version from GitHub: remotes::install_github("ethanbass/chromConverter")
  2. Reset the R session (in RStudio, you can do this by selecting Session:Restart R).
  3. Reload the package: library(chromConverter)
  4. Try again: read_chroms("Anthocyanin_2_MeOH001.txt", format_in="shimadzu_dad")
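
As a rough way to check whether a restart is still needed, the sketch below compares the package version loaded in the current session with the version installed on disk. `is_stale` is a hypothetical helper, not part of chromConverter or remotes, and it assumes the package is installed in one of the `.libPaths()` locations.

```r
# Hypothetical helper (not part of chromConverter): returns TRUE when the
# version loaded in this session differs from the version installed on disk.
is_stale <- function(pkg) {
  if (!isNamespaceLoaded(pkg)) {
    return(FALSE)  # nothing is loaded, so nothing can be out of date
  }
  loaded  <- package_version(unname(getNamespaceVersion(pkg)))
  on_disk <- packageVersion(pkg, lib.loc = .libPaths())
  loaded != on_disk
}

# Demonstrated on a base package, which is always installed and loaded
is_stale("stats")
```

After reinstalling, calling `is_stale("chromConverter")` and getting TRUE would suggest the session is still using the old code and should be restarted. This is only a heuristic: it cannot tell two builds apart if they carry the same version number.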

silasmellor commented 1 year ago

Yep, you're right, I had restarted the session before reinstalling from GitHub but didn't think to do so after... works now!

ethanbass commented 1 year ago

Excellent! I'm glad to hear it is working now

captainsailboat commented 1 year ago

Fully willing to do what I can to help with adding a .lcd parser, it would make life a lot easier! Let me know if there is anything I can do to help.

ethanbass commented 1 year ago

Hi @captainsailboat. Do you need a parser for the DAD format or mass spec data? I think there are similarities but the formats aren't exactly the same. I have been working a little bit on the DAD file I got from @silasmellor and have managed to partially decode it but there are still some things I haven't been able to figure out. Do you have any experience decoding binary files? If so, I could try to share with you what I have so far and you could have a look for yourself. Otherwise, it would be helpful to have a few more example .lcd files (along with the corresponding ASCII files) to check the validity of what I've worked out so far.

ethanbass commented 1 year ago

I think I have this mostly figured out. I uploaded the parser to the shimadzu_lcd branch (https://github.com/ethanbass/chromConverter/tree/shimadzu_lcd) if y'all want to try it out. I haven't yet figured out where the retention times are stored in the file, but the absorbance values are correct (at least for the file I received from @silasmellor). It is also quite slow: it takes over a minute on my computer to parse one file. You can access the parser (for now) using the read_shimadzu_lcd function.

ethanbass commented 1 year ago

I believe the final retention time should be stored somewhere in these two streams, but I haven't yet been able to figure out how it's encoded.

3D_data_item.txt 2D_data_item_stream.txt

captainsailboat commented 1 year ago

I am struggling to figure out how to utilize this. I have installed R-studio and opened the R file, but am not sure how to actually carry out the conversion. I am also solely using this to analyze the m/z values of fragments at this time so retention times are of little interest to me. If there are instructions somewhere on how to use this script please do guide me to that location. Thank you!

ethanbass commented 1 year ago

Hi @captainsailboat, sorry for the misunderstanding -- the parser I've written so far can only extract the diode array detector stream, as I mentioned above, which was the subject of the original issue.

I'm not sure yet how similar the encoding is for the mass spec data. While I'd be interested in trying to figure out the mass spec stream, I have very limited time to work on this and I also don't have a lot of example files to work with.

You might want to check out ProteoWizard (https://proteowizard.sourceforge.io/doc_users.html), I think they support conversions of (some?) .lcd files to mz(X)ML.

captainsailboat commented 1 year ago

Hey again,

I am grateful for your help in trying to add support for these Shimadzu filetypes. The native software that comes with the equipment allows conversion to mzXML. I can provide you with .lcd and .mzXML files for experimentation if needed. However, the conversion of 1 .lcd file to .mzXML yields 2 separate mzXML files with extensions Ev1 and Ev3 which can make analysis a bit frustrating when having to switch back and forth. At the end of the day, I am capable and have access to workstations where I can analyze the data with the native software, though it would be amazing if I could carry out this same analysis from the comfort of my home. Once again, I appreciate you for working on this as an extension of what I can imagine is already a busy life for you.

Cheers.


ethanbass commented 1 year ago

It would be helpful if you could send some example files! You should be able to upload them directly here (you may need to change the extensions to something GitHub accepts, like .txt) or you could email them to me. Could you also describe in a little more detail what kind of data is contained in the two mzXML files and what form you're trying to get the data into for further analysis? If it's a similar instrument to the one discussed in this thread (https://github.com/rietho/IPO/issues/32), it sounds like maybe the two files contain data from positive and negative ionization modes...? Ethan