openworm / tracker-commons

Compilation of information and code bases related to open-source trackers for C. elegans

Create converter for BDML #113

Closed by MichaelCurrie 7 years ago

MichaelCurrie commented 8 years ago

They have a publication out already on their format: http://bioinformatics.oxfordjournals.org/content/31/7/1044

The BDML schema is here: http://ssbd.qbic.riken.jp/bdml/bdml0.2.xsd

Our WCON schema is here: https://github.com/openworm/tracker-commons/blob/master/wcon_schema.json

http://www.qbic.riken.jp/english/news/highlight/bdml.html

We were introduced to this format by @DrKenHo on the gitter chat system on 23 May 2016.

We have put 11 sets of quantitative experimental data on worm movement from Cronin et al. (2005) in our SSBD database here: http://ssbd.qbic.riken.jp/?value=%22cronin%22+%5Bcontributors%5D&form-submit=Search We use an XML format (BDML) to encode the data; details of the format can be found here: http://ssbd.qbic.riken.jp/publications/ Software to access the data via a REST API using Python, as well as ImageJ plugins, can be found here: http://ssbd.qbic.riken.jp/software/

Ichoran commented 8 years ago

I don't think it's entirely obvious that we should write this converter instead of them, but it would be nice if someone did.

cheelee commented 8 years ago

There's the balance of need + time vs effort. I'd have to check whether the 11 sets of C. elegans movement data from the Cronin paper are intended to be placed in the public domain in that format. If they are, and someone gets a working converter going, we'd have an early case where peer-reviewed published data is made available in two open formats for study by other scientists.

DrKenHo commented 8 years ago

We are happy to contribute to a WCON-to-BDML converter, especially if any of you have experimental or simulation data that you want to store in the SSBD database, or important data that you think others should have easy access to. We normally accept only peer-reviewed published data, so that we can provide some assurance of the quality of the data stored in our database. However, we are starting to get requests to store pre-publication data, so this policy may change in the future.

As for BDML-to-WCON conversion, it will probably be difficult for us to contribute directly, because we don't know what your requirements are, for example what you need in order to run the OpenWorm simulation.

DrKenHo commented 8 years ago

With regard to Cronin et al.'s data, we contacted them and asked for their permission. I am sure they will be happy to have it in WCON as well.

DrKenHo commented 8 years ago

One other thing with regard to BDML versions. BDML0.20 was developed in response to requests from the reviewers of our paper to include RDF and ontology terms in the format. At present, neither SSBD nor OpenSSBD supports BDML0.20: OpenSSBD supports up to BDML0.15 and SSBD supports BDML0.18.

For an implementation at this stage, I would recommend basing the converter on BDML0.18 instead of BDML0.20. The schema for BDML0.18 is here:

http://ssbd.qbic.riken.jp/bdml/bdml0.18.xsd

Ichoran commented 8 years ago

@DrKenHo - It should be fairly straightforward to use one of the WCON reader/writer implementations to read WCON data, and then convert the in-memory representation to something you can write as BDML. Our hope is that WCON is simple and limited enough that it is not very hard to at least access the data stored in that format.
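As a rough illustration of the "read WCON, then write the in-memory representation out as XML" idea, here is a minimal sketch using only the Python standard library rather than one of the WCON reader implementations. It assumes the simplest WCON layout: a top-level `units` object and a `data` list whose records hold `id`, a list `t`, and nested lists `x` and `y` with one row per timepoint. The function name `wcon_to_xml` and the XML element names (`tracks`, `track`, `frame`, `point`) are hypothetical placeholders, not the real BDML schema; an actual converter would need to follow bdml0.18.xsd.

```python
import json
import xml.etree.ElementTree as ET


def wcon_to_xml(wcon_text):
    """Convert WCON JSON text into an XML string.

    NOTE: the element names below are illustrative placeholders,
    NOT elements of the real BDML schema.
    """
    wcon = json.loads(wcon_text)

    root = ET.Element("tracks")

    # Copy the WCON units table so the XML stays self-describing.
    units = ET.SubElement(root, "units")
    for name, unit in wcon.get("units", {}).items():
        ET.SubElement(units, "unit", name=name, value=str(unit))

    # Assumption: "data" is a list of records with list-valued t, x and y.
    for record in wcon["data"]:
        track = ET.SubElement(root, "track", id=str(record["id"]))
        # One <frame> per timepoint, one <point> per skeleton coordinate.
        for t, x_row, y_row in zip(record["t"], record["x"], record["y"]):
            frame = ET.SubElement(track, "frame", t=str(t))
            for x, y in zip(x_row, y_row):
                ET.SubElement(frame, "point", x=str(x), y=str(y))

    return ET.tostring(root, encoding="unicode")
```

Calling `wcon_to_xml(open("example.wcon").read())` on a file in that layout (the file name is only an example) would return the XML as a string. Mapping the placeholder elements onto real BDML elements is the part that needs knowledge of the BDML schema.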

As far as BDML to WCON goes, you might find after writing a WCON-to-BDML converter that it's obvious how to do it the other way. If not, I agree that it makes more sense for us to pull out the data we are interested in than for your team to try to guess what we need.

MichaelCurrie commented 7 years ago

For now it looks like we will not be implementing this, but we may do so if a use case arises in the future. It is great that we have mapped out the representation differences here for that future effort.

DrKenHo commented 7 years ago

We are also moving to a binary format using HDF5. I shall post it when we get a paper out on the format. It may well be better for doing analysis directly on the data in the future.

MichaelCurrie commented 7 years ago

That would be interesting to see. @DrKenHo, if you are at UCLA for the worm meeting next week, please let me know and we can meet!

DrKenHo commented 7 years ago

Sorry @MichaelCurrie, I am not planning to attend the UCLA meeting, but a couple of my colleagues are going. I shall find out their schedule and see whether you can meet them instead.

MichaelCurrie commented 7 years ago

Sure, @DrKenHo, please ask them to email mcurrie@openworm.org. Thanks