galaxyproteomics / tools-galaxyp

Galaxy Tool Shed repositories maintained and developed by the GalaxyP community
MIT License

Skyline #208

Open Stortebecker opened 6 years ago

Stortebecker commented 6 years ago

Does anyone have experience with Skyline?

It seems to be the standard for MRM (targeted proteomics); the source code is available and it is licensed under the Apache 2.0 license. However, it is made for Windows only and uses 3rd party libraries with more restrictive licenses (a similar problem to msconvert). Do you know if these other licenses are absolutely needed? Or can we circumvent them by converting to mzML beforehand?

bgruening commented 6 years ago

xref: https://skyline.ms/_webdav/home/software/Skyline/@files/docs/Skyline%20Command-Line%20Interface-2_1.pdf for a very minimal command-line interface. So far no one has had the need for it afaik. I suppose it would be a lot of work to get this HPC ready.

chambm commented 6 years ago

Yes, the 3rd party licenses are only necessary for converting from vendor formats, and if compiled without them, it's pure Apache 2 or other compatible licenses, and will work for mzML and other open formats. I'm not sure whether the Skyline devs have ever actually focused on checking whether Skyline works without the vendor libraries compiled in though - it probably doesn't build without the vendor libraries right now.

Stortebecker commented 6 years ago

Hm, but maybe we could ask the developers to test it or even provide it excluding the 3rd party stuff. It is a ProteoWizard project like msconvert, which also works without the vendor libraries.

Maybe it is worth the effort. I know that Skyline is used quite a lot, e.g. at the proteomics core facility of the Helmholtz center in Munich. I have also heard it mentioned by colleagues working in the US. I have never worked with MRM before, but it's getting more common.

chambm commented 6 years ago

Another major reason Skyline doesn't make a lot of sense in Galaxy is that it's heavily user-interaction-oriented. You don't typically just set a bunch of settings up front and look at the numbers that come out. If a transition (precursor/product pair) doesn't look right, you interact with it, tweak settings until it does look good, then repeat for all other transitions that don't look right. The sheer number of possible outputs that Skyline has would be a nightmare to wrap in Galaxy I suspect, especially since in a workflow, all outputs are visible even if they are not selected. Perhaps there is some common subset of outputs which would make a decent tool. Who is the target user for this proposed Galaxy tool?

Stortebecker commented 6 years ago

@chambm Ok, I understand your reasoning in terms of finding the right transition. I was thinking more of a use case where an MRM assay is already established and Skyline is used for the routine analysis of many datasets. Is this something Skyline is used for at all? Or is it just used for MRM design?

brendanx67 commented 6 years ago

Absolutely, Skyline is used in this way more and more, and we do have a command-line interface. https://skyline.ms/wiki/home/software/Skyline/download.view?entityId=c33e6082-9c2a-102f-a8bb-da20258202b3&name=Skyline%20Command-Line%20Interface-3_7.pdf Our own tool AutoQC Loader uses this command-line interface to process system suitability runs as they are completed by the instruments and upload them automatically to a Panorama Server (often http://panoramaweb.org/). But we were originally thinking along the lines of taking a Skyline template method document, importing results and then exporting a template report, which would then be consumed by a downstream tool. I have done a lot of this myself to very positive effect with R as the downstream tool for generating plots or recording summary tables for future multi-data set plots.

You can even get a sense of this for DIA data in the two most recent Skyline Tutorial Webinars: https://skyline.ms/webinar14.url https://skyline.ms/webinar15.url

It does seem conceivable that we could produce a fully Apache friendly version of Skyline (with Matt's help) by limiting it to importing open formats (and no library building from Mascot .dat files). That would still be a Windows-only executable, but we have included getting this running on Linux in our current R01 aims, and .NET core seems like it could make this possible.
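As a rough illustration of the "export a template report, then consume it in a downstream tool" idea from this comment, here is a minimal Python sketch of the downstream step (the comment itself mentions R for this). It assumes the report was exported as CSV; the column names `Peptide`, `Replicate` and `Total Area`, the example rows, and the `#N/A` marker for missing values are hypothetical placeholders, since the actual columns depend on the report template.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def summarize_report(report_text, group_col="Peptide", value_col="Total Area"):
    """Return the mean of value_col per group_col from a CSV report.

    Column names are assumptions; adapt them to the report template in use.
    Cells that cannot be parsed as numbers are skipped.
    """
    areas = defaultdict(list)
    for row in csv.DictReader(io.StringIO(report_text)):
        raw = (row.get(value_col) or "").strip()
        try:
            areas[row[group_col]].append(float(raw))
        except ValueError:
            continue  # non-numeric cell, e.g. a missing-value marker
    return {peptide: mean(values) for peptide, values in areas.items()}


# Hypothetical report content, for illustration only.
example = """Peptide,Replicate,Total Area
PEPTIDER,run1,100.0
PEPTIDER,run2,300.0
ELVISK,run1,50.0
ELVISK,run2,#N/A
"""

summary = summarize_report(example)
for peptide, mean_area in sorted(summary.items()):
    print(f"{peptide}\t{mean_area:.1f}")
```

In a Galaxy setting, a step like this would be its own small tool that takes the exported report as its input dataset.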

brendanx67 commented 6 years ago

Here also is a support request from a Skyline user packaging Skyline for command-line pipeline use in Docker:

https://skyline.ms/announcements/home/support/thread.view?rowId=29076

Jarrett Egertson in the MacCoss lab has also been working on Dockerized Skyline automated processing.

Note that the licensing is such that you can redistribute your derivative works as long as they are not for commercial gain. That is the biggest difference between current Skyline packaging and true Apache licensing, where derivative works can be redistributed for commercial gain. But you can definitely redistribute freely available work based on the ProteoWizard data access layer.

chambm commented 6 years ago

Ugh. :man_facepalming: Of course it's still Windows only even without vendor libraries. It uses C++/CLI mixed mode assemblies to get data from ProteoWizard. Refactoring this to use something that can also work on Linux/Mono (I don't think .NET core is proposing to support C++/CLI at all) is quite frequently requested. But it's not going to be simple.

nilshoffmann commented 4 years ago

@chambm @brendanx67 Would you recommend using the Skyline / ProteoWizard Docker container instead of a native assembly nowadays for running on a Linux host? https://hub.docker.com/r/chambm/pwiz-skyline-i-agree-to-the-vendor-licenses

chambm commented 4 years ago

Yes, that's definitely the way to run on Linux now. Only with SkylineCmd from command-line, of course. The GUI aspects aren't working.
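For anyone wiring this into a pipeline, a minimal sketch of assembling such a container call from Python might look like the following. Several things here are assumptions rather than confirmed details from this thread: that the image runs SkylineCmd via `wine SkylineCmd`, that data is mounted at `/data`, and that the `--in`, `--import-file`, `--report-name` and `--report-file` options behave as described in the Skyline command-line documentation linked earlier. The file names and report name are hypothetical. Please check against the image's Docker Hub page and the current CLI document before relying on it.

```python
import shlex

# Image name taken from the Docker Hub link in this thread.
IMAGE = "chambm/pwiz-skyline-i-agree-to-the-vendor-licenses"


def docker_skylinecmd(data_dir, document, raw_file, report_name, report_file):
    """Build an argv list that runs SkylineCmd inside the container.

    Assumptions: 'wine SkylineCmd' is the entry point, data_dir is mounted
    at /data, and the option names match the Skyline CLI documentation.
    """
    return [
        "docker", "run", "--rm",
        "-v", f"{data_dir}:/data",
        IMAGE,
        "wine", "SkylineCmd",
        f"--in=/data/{document}",           # existing Skyline template document
        f"--import-file=/data/{raw_file}",  # results file to import (e.g. mzML)
        f"--report-name={report_name}",     # report defined in the document
        f"--report-file=/data/{report_file}",
    ]


# Hypothetical paths and report name, for illustration only.
cmd = docker_skylinecmd(
    "/home/user/mrm", "assay.sky", "run01.mzML", "Peak Areas", "run01.csv"
)
print(shlex.join(cmd))  # shell-quoted form for copy and paste
# To actually run it: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems with report names that contain spaces.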