ga4gh / ga4gh-server

Reference implementation of the APIs defined in ga4gh-schemas. RETIRED 2018-01-24
http://ga4gh.org
Apache License 2.0

Expert review of the read translation code required #347

Open jeromekelleher opened 9 years ago

jeromekelleher commented 9 years ago

There are multiple TODO items in the ga4gh/datamodel/reads.py file, particularly where we translate from pysam objects into the GA4GH equivalents (see the convertReadAlignment method). None of the current developers have the expertise required to be certain that the decisions made here are correct. We would very much appreciate some outside help on this.
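For readers unfamiliar with the kind of translation being discussed, here is a minimal sketch of two of the decisions involved: mapping SAM CIGAR operation codes and SAM flag bits onto GA4GH-style fields. It is not the code in ga4gh/datamodel/reads.py. The function names `convert_cigar` and `convert_flags` are hypothetical, and the GA4GH enum and field names are assumptions modelled on the ReadAlignment schema, so they may differ between schema versions. The input is assumed to be a list of (operation, length) tuples like pysam's `cigartuples`, plus an integer SAM flag.

```python
# Illustrative sketch only; not the implementation in ga4gh/datamodel/reads.py.

# SAM CIGAR operation codes 0-8 (SAM specification order: M I D N S H P = X),
# indexed by code. The GA4GH names on the right are assumptions.
SAM_OP_TO_GA4GH = [
    "ALIGNMENT_MATCH",    # 0, M
    "INSERT",             # 1, I
    "DELETE",             # 2, D
    "SKIP",               # 3, N
    "CLIP_SOFT",          # 4, S
    "CLIP_HARD",          # 5, H
    "PAD",                # 6, P
    "SEQUENCE_MATCH",     # 7, =
    "SEQUENCE_MISMATCH",  # 8, X
]

# SAM flag bits, as defined in the SAM specification.
FLAG_PROPER_PAIR = 0x2
FLAG_SECONDARY = 0x100
FLAG_QC_FAIL = 0x200
FLAG_DUPLICATE = 0x400
FLAG_SUPPLEMENTARY = 0x800


def convert_cigar(cigartuples):
    """Turn (operation, length) tuples into GA4GH-style CIGAR unit dicts."""
    return [
        {"operation": SAM_OP_TO_GA4GH[op], "operationLength": length}
        for op, length in cigartuples
    ]


def convert_flags(flag):
    """Unpack an integer SAM flag into GA4GH-style boolean fields.

    The field names are assumed, not taken from the server code.
    """
    return {
        "properPlacement": bool(flag & FLAG_PROPER_PAIR),
        "secondaryAlignment": bool(flag & FLAG_SECONDARY),
        "failedVendorQualityChecks": bool(flag & FLAG_QC_FAIL),
        "duplicateFragment": bool(flag & FLAG_DUPLICATE),
        "supplementaryAlignment": bool(flag & FLAG_SUPPLEMENTARY),
    }
```

For example, `convert_cigar([(0, 50), (4, 10)])` describes a 50-base alignment match followed by a 10-base soft clip, and `convert_flags(0x402)` reports a properly placed duplicate read. The open questions in this issue are about cases a table like this does not settle, such as unmapped reads or missing mate information.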

dcolligan commented 9 years ago

There's also the GA4GH -> pysam conversion code in ga4gh/converters.py

bioinformed commented 9 years ago

I won't claim to be an expert, but I'll try to find time to review the conversion code.

dcolligan commented 9 years ago

That would be awesome if you could, @bioinformed

jeromekelleher commented 9 years ago

@jmarshall --- It seems you have been having a look over our read translation code (which we are very grateful for!). Do you think we can mark this issue as done?

dcolligan commented 9 years ago

I don't think this is close to done. There are many TODOs in the relevant parts of ga4gh/datamodel/reads.py. I'm sure there are more bugs in there. On the other hand, given our uncertainty, this may be an otherwise eternally-open bug that serves no use and might be worth closing for that reason.

jmarshall commented 9 years ago

So far I've only been reacting to issues that go past in my @ga4gh news feed… but it has occurred to me that I'm risking accidentally volunteering to review reads.py in general :smile:

Okay, will continue to have a look, and I agree that there are still plenty of TODOs to look at.

Is the idea that people can run this server pointed at pretty much any BAM files they want to serve up, or are there constraints on the BAM files that the server likes?

jeromekelleher commented 9 years ago

> Is the idea that people can run this server pointed at pretty much any BAM files they want to serve up, or are there constraints on the BAM files that the server likes?

At the moment, yes, we'd like to be able to point to an arbitrary (well-formed) BAM file and do something sensible. We can introduce restrictions if we need to, though --- I think at the moment we don't really understand the range of inputs we can expect, so we haven't made any requirements.

Thanks for looking at this @jmarshall!