ypriverol / CompProt2019-paper

This repository stores the content of a Computational Proteomics manuscript Trends and Opportunities by 2019
Creative Commons Attribution 4.0 International

Suggested topics #5

Open prvst opened 5 years ago

prvst commented 5 years ago
bittremieux commented 5 years ago

I'd also add a discussion about smaller packages. The poll was geared towards stand-alone tools, but there are also a bunch of Python/R/C++ packages that contain re-usable components. No need to exhaustively list all of those, because there are just way too many and most of them are not used very broadly, but just to point out that a lot of stuff exists so it's not necessary to reinvent the wheel every time.

higsch commented 5 years ago

Good point, @bittremieux! In that context, we should also mention Nextflow to chain these tools and to make their execution reproducible.

bittremieux commented 5 years ago

In that case we could also mention other workflow tools such as Airflow, Luigi, Snakemake, ... I'm not sure whether we want to go that broad though. It would be good to clearly specify what message we want to convey with this manuscript and create an outline.

jspmccain commented 5 years ago

Further to @bittremieux, I think it would be very valuable to discuss the development of modular vs. workflow tools, and the community uptake rates of either. Certain tools tackle one specific problem and aim to be modular, so that they can be adopted in a given pipeline. Other tools seem to be built as one monolithic pipeline.

From the end-user perspective, maybe the monolithic approach is easier? (If you can get it installed!) But from a new developer perspective, it's much easier to make a modular component, I think?

trishorts commented 5 years ago

One topic that I think needs discussion is the use of standard formats (e.g. mzML). I'm a proponent of these because they make communication between users and programs more effective. One improvement that we could make as a community is to provide code for testing compatibility and compliance. And, maybe more importantly, libraries (Java, Python, and NuGet for C#) that both read and write the standard format. These could then be included in any program that needs them. One big problem with creating mzML writers and readers is the massive amount of information that you have to deal with. If the creators of the standard format gave everyone the tools for reading and writing, that would really promote use and limit breakdowns.
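
To make the "code for testing compatibility and compliance" idea a bit more concrete, here is a minimal sketch in Python using only the standard library. It is not an official validator and covers only a tiny slice of what real compliance checking would need (schema and controlled-vocabulary validation are out of scope). The function name `check_mzml_basics` and the sample document are hypothetical, and treating a missing `spectrumList` as a finding is an assumption of this sketch rather than a rule taken from the specification.

```python
import xml.etree.ElementTree as ET

# XML namespace used by mzML 1.1 documents.
MZML_NS = "http://psi.hupo.org/ms/mzml"


def check_mzml_basics(xml_text):
    """Return a list of problems found; an empty list means these basic checks passed.

    Hypothetical helper: it only checks well-formedness, the presence of an
    <mzML> element, and that spectrumList's declared count matches its content.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    ns = {"mz": MZML_NS}
    # The root is either <mzML> itself or an <indexedmzML> wrapper around it.
    if root.tag == f"{{{MZML_NS}}}mzML":
        mzml = root
    else:
        mzml = root.find("mz:mzML", ns)
    if mzml is None:
        return ["no <mzML> element in the expected namespace"]

    spectrum_list = mzml.find("mz:run/mz:spectrumList", ns)
    if spectrum_list is None:
        # Assumption of this sketch: we expect spectra to be present.
        return ["no <spectrumList> under <run>"]

    problems = []
    declared = spectrum_list.get("count")
    actual = len(spectrum_list.findall("mz:spectrum", ns))
    if declared is None or not declared.isdigit():
        problems.append("spectrumList has no numeric 'count' attribute")
    elif int(declared) != actual:
        problems.append(
            f"spectrumList declares count={declared} but contains {actual} spectra"
        )
    return problems


# Made-up sample: declares two spectra but contains only one.
SAMPLE = """<mzML xmlns="http://psi.hupo.org/ms/mzml" version="1.1.0">
  <run id="run1">
    <spectrumList count="2">
      <spectrum id="scan=1" index="0" defaultArrayLength="0"/>
    </spectrumList>
  </run>
</mzML>"""

for problem in check_mzml_basics(SAMPLE):
    print(problem)
```

A shared set of small checks like this, shipped alongside reader/writer libraries, is the kind of thing a tool author could run in their own test suite before releasing an exporter.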

bittremieux commented 5 years ago

For some (most?) of the PSI standards, official reference implementations (often in Java) are available. See for example jmzML, jmzIdentML, and jmzTab. Lists of other supporting tools are also available on the PSI website: mzML, mzIdentML, mzTab.

If these tools and APIs are not widely known, it could be useful to explicitly point this out.