Raw file reader/converter

hroest commented 8 years ago

I think providing a raw file reader would be a major step for the project and I know that on different mailing lists, multiple people have reported various levels of success in doing that. However, I think the BioDocker project would be the best framework to report, discuss and achieve progress on this goal. Basically it seems there are three ways to achieve this

1. run msconvert in a wine environment
2. reverse-engineer the vendor file formats
3. convince the vendors to provide native Linux libraries

Obviously (3) would be the optimal outcome for everyone on Linux, but in the meantime I think we have to focus on (2) and (1). I remember a conversation on the mailing list where partial success using wine was reported: https://sourceforge.net/p/proteowizard/mailman/proteowizard-developer/thread/563A64B3.3070100@gmail.com/ Of course this would be nicer to have inside a reproducible Docker container; maybe somebody can have a look at that? It seems that the https://github.com/sneumann/pwiz-appliance project is making great progress here, but the application is not fully automated, while another project, https://github.com/jmchilton/proteomics-wine-env, has a similar goal and even provides a script that may be used to generate a Docker container, but it contains some manual steps as well.

I think, especially given the comments of the MCP reviewers, having such a container would be a great boost for the project and would make its impact much clearer to people outside the project.

ypriverol commented 8 years ago

@hroest We know all these initiatives. I was in contact in person with the ProteoWizard people to move these containers forward, and they explained to me that the wine implementation produces a lot of errors during the conversion. It was actually one of our aims from the very beginning, and Felipe and I spent some time trying to implement a solution. For now I will do three things: 1- I will contact the Thermo developers about putting their library online using BioContainers. 2- I will contact someone from ProteoWizard to restart this discussion and see if we can arrive at a solution. 3- I will open a new container request related to ProteoWizard.

As a side note, I really think that MCP is not the right target audience for our community. We have more than 30 containers that can be easily used and deployed on any infrastructure, apart from MaxQuant and Skyline. We need to move the proteomics community further away from its dependency on Windows.

hroest commented 8 years ago

@ypriverol this sounds like a great way forward; even Thermo-only support would already be great. I think without doing some tests ourselves we cannot really determine whether Wine will work or not; we will have to see for ourselves. Also, we might want to talk to Steffen and John to get their experiences with their respective approaches.

ypriverol commented 8 years ago

@hroest I have been talking with John about this in the past. I will try to open a discussion here with @paragmallick @jmchilton @sneumann

prvst commented 8 years ago

As @ypriverol mentioned before, we spent some time on this in the past, way before BioDocker got some traction. Based on my experience it's technically and possibly legally impossible; let me elaborate:

Technical issues

The goal here is simple, to run a Windows program in a Linux environment using compatibility layers like Wine. ProteoWizard provides the most prominent library to be deployed into production. Also, as everyone knows, we depend on old black boxes, coded a long time ago in a galaxy far, far away... If I'm not mistaken, Thermo library is written in Basic, and this is one of the reasons we have problems.

In order to have an environment configured, we need to install old Microsoft libraries like vcrun2005 and above, and old dotnet like dotnet20, 30 sp1, sp2, etc....

Doing that in Wine is not straightforward since it depends on a 32-bit architecture. Some of the libraries mentioned above, like dotnet20, also need to be installed via a GUI, which can be a problem too, especially when dealing with a container.

Are there any tutorials ? / Do they Work ?

Yes, old ones:

SPCTools / No

princelab / No

Are there any working implementations ?

There are some newer attempts, but btw, none of them work!

phnmnl (this one is promising)

sneumann

jmchilton

In the end, they all run into more or less the same errors. My best attempt produced a partially converted mzML/mzXML file: only the header is converted, and when the first peaks start to be processed, some low-level errors about 'thread apartments' show up and the process freezes.

Is there any working solution ?

Surprisingly, I think there is. I remember having seen, a long time ago in a forum discussion, @jke000 saying that he had a Linux virtual machine perfectly configured to run wine and msconvert, and still working (needs to be checked, @jke000).

Legal issues, you say ?

Yes, there may be legal issues when creating and sharing containers with msconvert and the Thermo libraries. Technically, every container instantiated from an image will break the Thermo license and possibly others. This needs to be discussed further.

Now, this is the fun part...

Are there native implementations for converting RAW files ?

Yes !! in fact, there are two:

The first one is the really impressive work from Gene Selkov (@selkovjr), who was able to scrub some bits and bytes, and detailed in an impressive documentation the internals from Thermo's RAW files.

Mr. Selkov implemented his knowledge into a Perl module and software called Finnigan.

The sad part is that Mr. Selkov told me once by e-mail that he was not maintaining that anymore.

The second one is really new and fresh, and especially interesting to me, since I have been investing a lot in the Go language lately.

proteinspector

This piece was nicely done by Pieter Kelchtermans (@pkelchte), Lennart Martens (@lnnrt) and collaborators. I actually found this one months before the publication "Open-Source, Platform-Independent Library and Online Scripting Environment for Accessing Thermo Scientific RAW Files".

It is based on Mr. Selkov's work. Although it does not convert the files, it is a starting point, as it is able to read RAW files natively with a statically linked, platform-independent binary.

Concluding remarks

As you can see, there are promising efforts, but no option good enough to be deployed and used in production. I have been studying and working on this issue for a long time, as you can see, and I'm willing to move this forward.

If I left any important detail out of this post, please, feel free to add on.

sneumann commented 8 years ago

Hi, just for some history: sneumann/pwiz-appliance was there before I got to know docker :-) So I took that forward into phnmnl/docker-pwiz. We're currently discussing how to put automatic testing around our phnmnl containers (any hints or best practices ?!) that is more granular than "builds/failsToBuild". docker-pwiz with msconvert inside wine already works for Bruker files.

ypriverol commented 8 years ago

@sneumann thanks for your comments in BioContainers. We have been talking with Pablo from Phenomenal about best practices and we have a couple of issues open to discuss future ideas. You are more than welcome to contribute to BioContainers. Regarding phnmnl/docker-pwiz I would like to ask some questions: 1- Is it fully functional? 2- Which RAW files have you tested? We will be more than happy to stress these containers in the BioContainers community with multiple tests from different groups. Please let us know your ideas. This container is really important for other containers in BioContainers such as GalaxyP, OpenMS, etc.

sneumann commented 8 years ago

I am afraid Thermo .raw is one of the pain points, I opened an issue at Wine: https://bugs.winehq.org/show_bug.cgi?id=41124

hroest commented 8 years ago

Wow, this is really a great discussion going on here and there are so many aspects to consider. Thanks a lot to @prvst for writing up a summary of the current state of the art. I really hoped that my comment would get a discussion going and so it did! It seems that at the moment for wine, @sneumann has the most advanced solution while @pkelchte has the best reverse-engineered library (I think pursuing both options is sensible at this point).

@sneumann can you comment on which vendors currently work? all apart from Thermo? or just Bruker which is documented on your github page? What about AB Sciex?

@pkelchte is the unthermo library in a state where it could write out mzML / mzXML ? are you still developing it? and, obviously, is there a docker container for it :-) ?

ypriverol commented 8 years ago

I will add to @hroest's comment: do we have any way to help @sneumann and @pkelchte? We can use PRIDE raw data to massively test the conversions. If you agree with that, @sneumann, I can run the tests.

pkelchte commented 8 years ago

Hi @hroest, long time no see! The bad news is that I am currently no longer maintaining unthermo. As much as I loved writing this little library, I simply don't have the resources anymore after switching jobs.

The good news however is that it works pretty well, and the whole library is only around 1000 lines of code. In short: check out the github repository; it's super easy to fork and improve.

There is purposefully no xml code in there. If you look at the example apps (see the unthermo wiki), I just focused on getting data out. The Go library was the start of a bigger toolkit that wouldn't need intermediary conversion steps.

If you really need mzML formatted files, I would encourage you to write a simple xml marshaller in Go. That would only take a few hours.

Good luck, and I'd be happy to help you out if you have more questions!

prvst commented 8 years ago

@pkelchte

> If you really need mzML formatted files, I would encourage you to write a simple xml marshaller in Go. That would only take a few hours.

I have one, just not sure yet how to integrate with the rest of the library. Actually I have marshaller/unmarshaller libraries for mzML, mzXML, pepXML and protXML files

ypriverol commented 8 years ago

Hi @hroest @prvst if this code is not maintained anymore I would suggest investing time in the @sneumann solution. If we can't find a solution that way, then we can think about proposing a complete solution rather than partial ones.

pkelchte commented 8 years ago

That's great! What is your Go internal representation of these files?

You would basically create your objects from unthermo objects and then call the marshaller.

Unthermo uses the parent library "ms" objects: a Scan struct that contains a Peak array called Spectrum.

In the xic example you see the scans getting read from the file one by one. Its Spectrum is iterated over and peaks are printed.

In your case, instead of printing, you would add peaks to your Go mzML object. When all peaks are handled, you call your marshaller.
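
The steps described above (read scans one by one, copy their peaks into an XML object, then call the marshaller) could look roughly like the sketch below. It is not unthermo's actual API: `Peak` and `Scan` are hypothetical stand-ins for the reader's objects, and the `<peaks>` attribute names follow older mzXML schema versions, so check them against the schema version you target. The part that is standard mzXML is the encoding: m/z-intensity pairs packed as big-endian ("network" order) floats and then base64-encoded.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"encoding/binary"
	"encoding/xml"
	"fmt"
)

// Peak and Scan are hypothetical stand-ins for the reader's objects
// (a scan holding a slice of peaks); the real unthermo/ms types may differ.
type Peak struct {
	Mz, I float64
}

type Scan struct {
	MSLevel  int
	Spectrum []Peak
}

// XMLPeaks and XMLScan model a reduced mzXML <scan> element (assumed layout).
type XMLPeaks struct {
	Precision int    `xml:"precision,attr"`
	ByteOrder string `xml:"byteOrder,attr"`
	PairOrder string `xml:"pairOrder,attr"`
	Data      string `xml:",chardata"`
}

type XMLScan struct {
	XMLName    xml.Name `xml:"scan"`
	Num        int      `xml:"num,attr"`
	MSLevel    int      `xml:"msLevel,attr"`
	PeaksCount int      `xml:"peaksCount,attr"`
	Peaks      XMLPeaks `xml:"peaks"`
}

// EncodePeaks packs m/z-intensity pairs as big-endian 32-bit floats
// and base64-encodes the resulting bytes.
func EncodePeaks(peaks []Peak) string {
	var buf bytes.Buffer
	for _, p := range peaks {
		// Writes to a bytes.Buffer cannot fail for fixed-size values.
		binary.Write(&buf, binary.BigEndian, float32(p.Mz))
		binary.Write(&buf, binary.BigEndian, float32(p.I))
	}
	return base64.StdEncoding.EncodeToString(buf.Bytes())
}

// ToXMLScan copies one reader scan into the XML object.
func ToXMLScan(num int, s Scan) XMLScan {
	return XMLScan{
		Num:        num,
		MSLevel:    s.MSLevel,
		PeaksCount: len(s.Spectrum),
		Peaks: XMLPeaks{
			Precision: 32,
			ByteOrder: "network",
			PairOrder: "m/z-int",
			Data:      EncodePeaks(s.Spectrum),
		},
	}
}

func main() {
	// In a real converter this scan would come from the RAW file reader.
	s := Scan{MSLevel: 1, Spectrum: []Peak{{Mz: 1, I: 2}}}
	out, err := xml.MarshalIndent(ToXMLScan(1, s), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Casting to 32-bit floats halves the file size but loses precision; a 64-bit variant would only change the two casts and the `precision` attribute.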

paragmallick commented 8 years ago

Hi All,

Thanks so much for looking into this! I wish I had a better and more useful suggestion.

However - two points:

1) We have been working on a web/cloud-based workflow thing that might be useful for a subset of people (mostly the biology community). It’s called SpellBook and will be released shortly. It does rely extensively on dockers. However, we still rely on windows for conversion =(

2) There is also this emerging initiative http://www.allotrope.org that is meant to ultimately create an open container that all the vendors agree to. It may be worth paying attention to and bringing this non-windows perspective to.

Best, ~ Parag M ~


ypriverol commented 8 years ago

Hi @paragmallick, great to hear about this project. Would you like to participate in BioContainers by adding some of your containers or by using BioContainers?

About the conversion, I remember you mentioned before that some of your collaborators have been working on this issue. Is there anyone who can help in this direction?

Regards Yasset

hroest commented 8 years ago

Hi @paragmallick that sounds great, so what is your current plan for people to use SpellBook, can they upload .raw and you guys use Windows in the cloud to convert or do people have to upload mzML? Have you guys thought about the possibilities mentioned here in the discussion to use native Linux?

@pkelchte indeed, we should meet up again, especially as you are in SF now !! I remember from HUPO last week that Lennart Martens said that they use the library internally to access data (I am not sure whether he specifically said that they have an mzML converter), but it seems like they are using it a lot and I would therefore not call it unsupported. If the Martens group is using it then I am sure it will continue working, but maybe someone (e.g. @pkelchte) could inquire and give us more information about the current status and future of the project. Also, it looks like there is already an MGF writer in the tools part of the repo, unthermo/tools/labelq.go, so that may be a start for creating an mzML writer. Unfortunately I don't know any Go...

@ypriverol given that this project seems to be the best shot we have at native reading of raw files, I don't think we should abandon it.

prvst commented 8 years ago

hey @pkelchte , check this:

<mzXML xmlns="">
   <msRun scanCount="18949" startTime="" endTime="">
     <parentFile fileName="" fileType="" fileSha1=""></parentFile>
     <msInstrument msInstrumentID="0">
       <msManufacturer category="" value=""></msManufacturer>
       <msModel category="" value=""></msModel>
       <msIonisation category="" value=""></msIonisation>
       <msMassAnalyzer category="" value=""></msMassAnalyzer>
       <msDetector category="" value=""></msDetector>
       <software type="" name="" version=""></software>
     </msInstrument>
     <dataProcessing centroided="0">
       <processingOperation name=""></processingOperation>
     </dataProcessing>
   </msRun>
 </mzXML>

This is the result of your library + my mzXML marshaller/unmarshaller library. It contains only scanCount. How can I get the rest of the header information?

I'm starting with the msRun tag, so in my library it looks something like this:

// MSRun tag
type MSRun struct {
	XMLName        xml.Name       `xml:"msRun"`
	ScanCount      int            `xml:"scanCount,attr"`
	StartTime      []byte         `xml:"startTime,attr"`
	EndTime        []byte         `xml:"endTime,attr"`
	ParentFile     ParentFile     `xml:"parentFile"`
	MSInstrument   MSInstrument   `xml:"msInstrument"`
	DataProcessing DataProcessing `xml:"dataProcessing"`
	Scan           []Scan         `xml:"scan"`
}

this is getting interesting...
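
For readers following along, here is a minimal, self-contained sketch of how a header struct like the one above turns into XML once its fields are populated. It is a reduced illustration, not prvst's library: only `msRun` and `parentFile` are modelled, the time attributes are plain strings instead of `[]byte`, and the `Duration` and `Render` helpers are made up for this example. The sketch assumes the time attributes are written as `xs:duration` values in seconds; verify that against the mzXML schema you target.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strconv"
)

// ParentFile is a reduced version of the <parentFile> element.
type ParentFile struct {
	FileName string `xml:"fileName,attr"`
	FileType string `xml:"fileType,attr"`
	FileSha1 string `xml:"fileSha1,attr"`
}

// MSRun is a reduced version of the struct shown above.
type MSRun struct {
	XMLName    xml.Name   `xml:"msRun"`
	ScanCount  int        `xml:"scanCount,attr"`
	StartTime  string     `xml:"startTime,attr"`
	EndTime    string     `xml:"endTime,attr"`
	ParentFile ParentFile `xml:"parentFile"`
}

// Duration formats a time in seconds as an xs:duration such as "PT1.5S".
// Hypothetical helper for this sketch.
func Duration(seconds float64) string {
	return "PT" + strconv.FormatFloat(seconds, 'f', -1, 64) + "S"
}

// Render marshals the run header without indentation.
func Render(run MSRun) (string, error) {
	out, err := xml.Marshal(run)
	return string(out), err
}

func main() {
	run := MSRun{
		ScanCount: 2,
		StartTime: Duration(0),
		EndTime:   Duration(1.5),
		// The file name and type are illustrative values only.
		ParentFile: ParentFile{FileName: "run.raw", FileType: "RAWData"},
	}
	s, err := Render(run)
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

Attributes without `omitempty` are always emitted, which is why unpopulated fields show up as empty strings in the output above rather than disappearing.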

prvst commented 8 years ago


@ypriverol @hroest @pkelchte it's working !!! I'm writing mzXML files from Thermo RAW data. We just need to figure out how to get the remaining information to make the file complete.

pkelchte commented 8 years ago

@prvst that is awesome! Nice work!!

I know that some information is not in the ms Scan object because it was a little harder to extract from the Thermo files. See this page for everything that is currently included.

https://godoc.org/bitbucket.org/proteinspector/ms#Scan

What data are you looking for? I contacted Lennart and cc'd Hannes asking to connect you guys to one of Compomics developers.

If you want to start experimenting yourself, look for example at the info variable in the Open function: https://bitbucket.org/proteinspector/ms/src/103bbbafd87a4e30bec3f6c68c7d1f7b99019a9f/unthermo/reader.go?at=master&fileviewer=file-view-default#reader.go-28

There might be a lot of untapped information readily accessible. For example in the Preamble field there is the experiment date if I remember correctly.

So cool that you got it working! Have fun with the next steps!

prvst commented 8 years ago

@pkelchte thanks, but that's mainly because of your work too.

I now have an mzXML file with some information, but most of the fields are empty, for example in the struct I mentioned above (MSRun). I'm still looking through the code to see where I can find those values.

Regarding your library, I downloaded it and reorganized all functions, methods, interfaces and structs into my own Golang mass spec lib. I'm trying to figure out where the information is coming from.

jke000 commented 8 years ago

I (and a couple of labs here at the Univ. of Washington) do our Thermo RAW to mzXML conversions under Wine using ReAdW. This works because we still compile ReAdW under Visual Studio 2010 as a 32-bit binary. The main issue with getting msconvert running under Wine is that they moved to Visual Studio 2013 a long time ago. Wine support for VS2013 binaries was poor even just a year ago, but it looks like the Wine project has made some progress supporting VS2013, so maybe there's hope if anyone is up for some pain in testing; I just tried it and got nowhere.

A year ago, I worked with Kaipo Tamura from Skyline/MacCoss group who compiled an older version of 32-bit msconvert under Visual Studio 2010 (last version before that project moved over to VS2013) and we got that binary to run under Wine. Of course, I can't replicate that now on the same box that worked a year ago (some vexing Side-by-Side issue that I've seen before) but at least I know that the older version of msconvert can work under Wine. Getting ReAdW running under Wine is easy by comparison.

hroest commented 8 years ago

@jke000 these are some important comments, thanks for your input. It is at least very good to hear that it works -- at least in principle :-) As you mentioned, these things are not always easy to keep working, and I think this is where Docker could come in handy and create a stable and reproducible environment that many other people outside Univ. of Washington could use.

Regarding the distribution, we may still have some legal issues to consider, as @prvst mentioned. It is quite likely that we won't be able to distribute a full Docker container with the vendor libraries inside, but will possibly have to strip these libraries out so that the user has to get them separately and copy them into the container... However, when actually reading the licenses that you agree to when downloading pwiz (here http://proteowizard.sourceforge.net/downloads.shtml ), most of them actually allow you to re-distribute the code under certain conditions (e.g. non-commercial), but I am not a lawyer and all vendors' licences are different ...

ppedrioli commented 8 years ago

Hi Hannes,

Thanks for bringing my attention to the thread.

As Jimmy pointed out, ReAdW works really well under wine. In fact, I have used it in my lab for the past 5 years to convert all of the files we generated on our cluster. One of the advantages of doing it this way is the possibility to do many parallel conversions, which in turn is why I never bothered with a Docker version. However, I do agree that it might be useful in other scenarios and I might give this a go when I find some spare time. Regarding the DLLs, it might be easier (to avoid any legal questions) to just wrap the Docker in a bash script that automatically mounts them after the user downloads them.

Also, related to this topic, I can tell you that SCIEX also has a Docker container for the conversion of their files... Unfortunately, we are not at liberty to circulate it, but they might listen if we make a good case...
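
The wrapper idea mentioned above (ship the image without vendor DLLs and mount them from the host at run time) can be sketched as follows. The thread suggests a bash script; this sketch uses Go to stay consistent with the other code in the discussion. Everything specific here is an assumption for illustration: the image name `example/readw-wine`, the `/dlls` and `/data` mount points, and the command run inside the container are all hypothetical. The program only prints the `docker` command it would run.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// DockerArgs builds the argument list for "docker run", mounting the
// user-supplied DLL directory read-only and the data directory read-write.
func DockerArgs(image, dllDir, dataDir string, cmd ...string) []string {
	args := []string{
		"run", "--rm",
		"-v", dllDir + ":/dlls:ro",
		"-v", dataDir + ":/data:rw",
		image,
	}
	return append(args, cmd...)
}

func main() {
	if len(os.Args) != 3 {
		fmt.Println("usage: wrapper <dll-dir> <data-dir>")
		return
	}
	// Bind mounts need absolute host paths.
	dllDir, err := filepath.Abs(os.Args[1])
	if err != nil {
		panic(err)
	}
	dataDir, err := filepath.Abs(os.Args[2])
	if err != nil {
		panic(err)
	}
	// Refuse to continue if the user has not downloaded any DLLs yet.
	// Note: this pattern is case-sensitive and will miss "*.DLL".
	dlls, err := filepath.Glob(filepath.Join(dllDir, "*.dll"))
	if err != nil || len(dlls) == 0 {
		fmt.Println("no DLLs found in", dllDir)
		return
	}
	// Hypothetical image and in-container command.
	args := DockerArgs("example/readw-wine", dllDir, dataDir,
		"wine", "ReAdW.exe", "/data/sample.raw")
	fmt.Println("docker " + strings.Join(args, " "))
}
```

Keeping the DLLs outside the image means the image itself can be shared, while each user remains responsible for obtaining the vendor libraries under their own license.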

ypriverol commented 8 years ago

Hi all:

I was playing yesterday with Windows Docker containers and they are ready to be used on Windows; as far as I can see, the idea is that it will soon be possible to use Windows containers in Linux environments. Then a full solution can probably be implemented soon. In the meantime we can work on these three different solutions:

1- @ppedrioli @jke000 we can work with you to put ReAdW in a container. This will be the native ReAdW container. To start, we would probably need the version of ReAdW that you are using and the version of wine. If you agree I can open an issue in the containers repo to discuss this particular implementation.

2- @prvst is working, with @pkelchte's support, on the Go container. @mvaudel probably knows the developers from Compomics who maintain this software, and they can help us move this version forward.

3- @sneumann I was talking with @pcm32 about the full wine container for pwiz. Let us know how we can help and what we can do.

If we have some of the major vendor APIs in individual containers, we can pack everything with the pwiz Linux version and provide it as a full solution, while keeping the individual ones for users who prefer them. I'm now in contact with people from Thermo to see if they are interested in collaborating. BTW @ppedrioli do you know who we can contact at AB SCIEX to discuss this?

prvst commented 8 years ago

@ypriverol I'm not working on the container yet; I'm working on the code, adapting the reader library to my own mass spec library. I started working with Go about a year ago, so I already have marshaller/unmarshaller functions for mzXML and mzML files. So far I have about 90% of the top tags and headers and 90% of the spectra tags.

@ppedrioli I'm curious about that working solution; do you think it is something you can share? I've been working with a lot of conversions lately.

prvst commented 8 years ago

@pkelchte Hey, how can I get MS2...n from the file? It seems I'm only printing MS1. Can you send me your email? I don't want to spam this thread with technical issues.

sneumann commented 8 years ago

Hi @ypriverol , one thing that would be great are best practices/documentation on how Biodocker is doing container testing: Do you have a container (with possibly large amounts of test data) sitting on top of the production container ? Are you using any xUnit frameworks to test containers ? I'd like to have a collection of MS vendor data and test not only that conversion does not crash, but also that the result is correct w.r.t number of spectra, some (recalibrated) known masses etc. Yours, Steffen
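
One of the checks asked for above, that a conversion yields the expected number of spectra, can be sketched as a small streaming test. This is only an illustration under assumptions: the `sample` document is a hand-written, heavily reduced mzML skeleton and `Verify` is a made-up helper name. It relies on two facts about mzML: `<spectrumList>` carries a `count` attribute and each spectrum is a `<spectrum>` element. A truncated file (for example from a converter that froze mid-run) surfaces as a parse error.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// sample is a reduced, hand-written mzML skeleton used only for this sketch.
const sample = `<mzML><run><spectrumList count="2">
<spectrum index="0" id="scan=1"></spectrum>
<spectrum index="1" id="scan=2"></spectrum>
</spectrumList></run></mzML>`

// CountSpectra streams through the document and returns the count declared
// on <spectrumList> (-1 if absent) and the number of <spectrum> elements.
func CountSpectra(r io.Reader) (int, int, error) {
	declared, found := -1, 0
	dec := xml.NewDecoder(r)
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return declared, found, nil
		}
		if err != nil {
			return declared, found, err
		}
		se, ok := tok.(xml.StartElement)
		if !ok {
			continue
		}
		switch se.Name.Local {
		case "spectrumList":
			for _, a := range se.Attr {
				if a.Name.Local == "count" {
					n, err := strconv.Atoi(a.Value)
					if err != nil {
						return declared, found, err
					}
					declared = n
				}
			}
		case "spectrum":
			found++
		}
	}
}

// Verify checks a converted document against the expected spectrum count.
// It takes a string for brevity; a real test would stream from a file.
func Verify(mzML string, want int) error {
	declared, found, err := CountSpectra(strings.NewReader(mzML))
	if err != nil {
		return fmt.Errorf("parse error (truncated output?): %w", err)
	}
	if declared != want || found != want {
		return fmt.Errorf("want %d spectra, declared %d, found %d", want, declared, found)
	}
	return nil
}

func main() {
	fmt.Println("sample ok:", Verify(sample, 2) == nil)
}
```

The expected counts per test file would have to come from a trusted reference conversion, for example one done on Windows with the vendor libraries.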

ypriverol commented 8 years ago

Hi @sneumann we have an ongoing issue here #60

ppedrioli commented 8 years ago

Hi Everyone,

I have managed to find some time to put together a dockerized version of ReAdW here.

I have had to resort to some unorthodox workflows to configure wine (i.e. this is done after building the basic image from a standard shell).

The repository does not contain the DLLs from Thermo, which you will need to provide at build time.

Also note that the container makes use of X11 forwarding and that it will create a user with the same uid as the user that builds it.

Please give it a go and let me know how it works for you.

ypriverol commented 8 years ago

@ppedrioli I was actually working on this today. I will test it right now.

BTW: Do you think it is possible to download pwiz directly from the ProteoWizard repository using:

LATEST=`wget -O- http://teamcity.labkey.org:8080/repository/download/bt36/.lastSuccessful/VERSION?guest=1`
wget -O '/tmp/pwiz-setup-'$LATEST'-x86.msi' 'http://teamcity.labkey.org:8080/repository/download/bt36/.lastSuccessful/pwiz-bin-windows-x86-vc120-release-'$LATEST'.tar.bz2?guest=1'

I guess this will count like a download for ProteoWizard Team and they will be able to trace it. Also we can distribute the license as a LICENSE.md file with the container.

Thanks for doing this.

ppedrioli commented 8 years ago

@ypriverol Sorry I am not sure what you are asking for with respect to pwiz.

ReAdW is independent of Proteowizard...

ypriverol commented 8 years ago

@ppedrioli The Thermo dependencies can be downloaded from the ProteoWizard repo using:

LATEST=`wget -O- http://teamcity.labkey.org:8080/repository/download/bt36/.lastSuccessful/VERSION?guest=1`
wget -O '/tmp/pwiz-setup-'$LATEST'-x86.msi' 'http://teamcity.labkey.org:8080/repository/download/bt36/.lastSuccessful/pwiz-bin-windows-x86-vc120-release-'$LATEST'.tar.bz2?guest=1'

We would like to build this docker container automatically. I will explore if those dependencies are in pwiz package.

ppedrioli commented 8 years ago

@ypriverol Ah, thanks for the clarification!

Yes, it would be nice to build this automagically. However, keep in mind that at least during my admittedly limited attempts, I was unable to get winetricks to play nice inside the Dockerfile.

ypriverol commented 8 years ago

OK, we can investigate that. Can you let me know which version of MSFileReader from Thermo you are using, so I know which one to download?

ppedrioli commented 8 years ago

@ypriverol

TBH I don't remember exactly, but I am fairly sure I got my DLLs from one of our machines running XCalibur, not from MSFileReader. Right now I don't have a windows box handy to test with MSFileReader, sorry. If you don't find anything suitable please let me know and I will try it as well. Also, @jke000 might be able to offer better guidance on this as he is also running ReAdW under wine.

ypriverol commented 8 years ago

Yes sure @ppedrioli. If I manage to run this version I will open a formal request in the containers repo to formalize this Docker container. Thanks a lot for your support.

jke000 commented 8 years ago

We've compiled ReAdW to work with either Xcalibur or MSFileReader libraries but there is a separate ReAdW binary for each library. Here are very simple instructions for getting ReAdW running under Wine using Thermo's MSFileReader library:

http://proteomicsresource.washington.edu/protocols06/wine/

Essentially just download MSFileReader version of ReAdW binary from GitHub, MSFileReader from Thermo, and issue one Wine command to install MSFileReader. That's it.


ypriverol commented 8 years ago

Hi all: Thanks to the support of @ppedrioli and @jke000, we have the first container for Thermo RAW to mzXML; here is the container: https://github.com/BioContainers/local-containers/tree/master/ReAdW A couple of issues here:

I created a local-containers repository for containers that can't be deployed on public servers due to licence issues. In this example we need to request the original DLLs from Thermo. @ppedrioli @jke000 do you think we can add the DLLs if we add the Thermo license to the container? This would enable us to deploy the container, and also to move it to the containers repository as a recipe.

@sneumann @hroest can you test the container on your side?

Again thanks to @ppedrioli for the first container.

prvst commented 8 years ago

Hey @pkelchte, can you send me your contact (or gitter/slack), I want to ask you some things about the lib.

ypriverol commented 7 years ago

@ppedrioli yesterday in a metabolomics meeting someone asked me if it is possible to extend ReAdW to get more information out of the raw files. So two questions: Is ReAdW open source? Are you willing to collaborate?

Regards Yasset

pcm32 commented 7 years ago

Yes, we are interested in seeing this going forward as well. I have forwarded this issue to several people. Ralf Weber in Birmingham is the person that @ypriverol mentioned in his last comment.

ypriverol commented 7 years ago

Hi all: Yesterday, in a metabolomics meeting at EBI with more than 40 participants of the Phenomenal Project http://phenomenal-h2020.eu/home/, the problem of RAW conversions was mentioned as one of the major issues stopping the project and the reuse/sharing of the data. I see so far three main issues here:

Reuse of proteomics/metabolomics data at large scale is complex now, because everything is tied to Windows environments while most HPC facilities are Linux based.
Tying a complete field to one environment limits the growth of open-source software, with the corresponding problems (reproducibility, portability, etc.).
Public money is wasted on making this software portable to other environments.

In the past we followed the approach of creating file formats like mzXML, mzML, etc., but we should push this openly by moving the community in favour of portable APIs. We have a strong case for companies:

This will open up the reuse of proteomics data at large scale.
We can start to exchange "RAW" files without needing to transform them into new files, moving the responsibility to the software side rather than the files. I will always prefer to keep the original RAW data rather than the transformed files.

I just created different channels to discuss and give our opinions about this. This thread is one, and this twitter thread is another one: https://twitter.com/ypriverol/status/802102443920281600 We need to actually arrive at an agreement and not continue creating patches for this problem. If I'm too optimistic or creating an issue that does not exist, then this thread will also demonstrate that.

This is my modest opinion.

If you agree leave your vote, if you want to say more please leave your comment.

hroest commented 7 years ago

I don't think that we should store all of our data in a proprietary, closed format which we can only access with pre-compiled binaries (not sure whether I understood you correctly here). This will not allow us to do with the data what we want and may make it impossible/impractical to access older data when a company stops existing or no longer offers a RAW reader for an older format of theirs. Independent of whether an API exists for RAW data, I believe the data itself should be stored in an open format for which multiple open-source APIs exist, which will allow all programmers on all platforms to access the data.

sneumann commented 7 years ago

Recently, there was success on the Thermo conversion using msconvertGUI in this container: https://github.com/meier-rene/MSConvertGUI-docker . This can convert from .raw to .mzML, but not in unattended fashion and relying on GUI and mouse.

prvst commented 7 years ago

Hey @sneumann ;

Thanks for that.

I just tried that and got the same error messages I get when working with a simple local wine installation. Did you have any luck running it?

sneumann commented 7 years ago

The container by rene works nicely for bruker and thermo via the gui. Which error do you get? hangs while writing out the mzML? Yours Steffen

I blame Android for the brevity and typos

sneumann commented 6 years ago

Hi @ppedrioli , is this https://hub.docker.com/r/sciex/wiffconverter/ the container you referred to ? I was unable to get it to work on the first attempt, there is no ENTRYPOINT for docker run -it sciex/wiffconverter:0.7.

I found the executable in /usr/local/bin/sciex/wiffconverter and the image seems to be an Ubuntu. But there is no wine to execute the OneOmics.WiffConverter.exe. Any ideas ? What docker invocation might convert my *.wiff ? Yours, Steffen

hroest commented 6 years ago

have you tried

docker run sciex/wiffconverter:0.7 mono /usr/local/bin/sciex/wiffconverter/OneOmics.WiffConverter.exe

On my system that works and produces some output. Then you can run it as follows:

docker run -v /path/to/whatever/:/data:rw sciex/wiffconverter:0.7 mono /usr/local/bin/sciex/wiffconverter/OneOmics.WiffConverter.exe WIFF /data/testfile.wiff -profile MZML /data/testfile.mzML
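
To convert more than one file, the invocation above can be generated per file. The sketch below only builds and prints the commands; the image name, converter path and argument order are copied from the command above, while `ConvertArgs` and the example file names are made up for illustration. It assumes the output should sit next to the input with an `.mzML` extension.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

const (
	image     = "sciex/wiffconverter:0.7"
	converter = "/usr/local/bin/sciex/wiffconverter/OneOmics.WiffConverter.exe"
)

// ConvertArgs returns the "docker" arguments that convert one .wiff file
// located in hostDir, which is mounted at /data inside the container.
func ConvertArgs(hostDir, wiffName string) []string {
	in := path.Join("/data", wiffName)
	out := strings.TrimSuffix(in, path.Ext(in)) + ".mzML"
	return []string{
		"run", "-v", hostDir + ":/data:rw", image,
		"mono", converter,
		"WIFF", in, "-profile", "MZML", out,
	}
}

func main() {
	// Example file names; replace with the files found in your data directory.
	for _, name := range []string{"testfile.wiff", "blank.wiff"} {
		args := ConvertArgs("/path/to/whatever", name)
		fmt.Println("docker " + strings.Join(args, " "))
	}
}
```

Passing the resulting slice to `os/exec` instead of printing it would run the conversions directly.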

sneumann commented 6 years ago

Excellent, the executable starts, I was not aware that I can run an *.exe not only with wine, but also with mono. Yours, Steffen

BioContainers / specs

Raw file reader/converter #64