ga4gh / ga4gh-schemas

Models and APIs for Genomic data. RETIRED 2018-01-24
http://ga4gh.org
Apache License 2.0

Human-readable documentation / explicit use cases missing #256

Open nouyang-curoverse opened 9 years ago

nouyang-curoverse commented 9 years ago

The mass of auto-generated API doc, http://ga4gh.org/documentation/api/v0.5.1/ga4gh_api.html#/, while infinitely better than nothing, is now hindering distilling / reaching clarity and consensus about some simple issues and edge cases (in my opinion).

Can we move to include more human-readable documentation along with the Avro schemas, or at least in our Github Issues discussions?

I think this will resolve a lot of confusion being generated because people are submitting solutions/pull-requests that resolve specific use cases. Without making the use cases we have in mind explicit to each other, the fact that our solutions are conflicting and need a step back to resolve may not be readily apparent.

Additionally, explicit use cases will easily allow us to write end-user oriented documentation in the future (where here the end-users are actually developers for various institutions / software packages).

fnothaft commented 9 years ago

Strong +1

richarddurbin commented 9 years ago

I have also found, for the reference graph variation API, that the avro docs have become very bloated with documentation, so that changes in the spec are hard to pick out of a sea of changes in the documentation. So I would also prefer having much lighter-weight avro, with documentation only on field values, and for the overview of the design, use case examples, discussion etc. to be recorded in a parallel document designed to be read. Maybe we could agree on a format for these parallel documents and start them, then progressively migrate over some of the descriptive material from the avro docs?

Richard

The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.

ekg commented 9 years ago

I think we should use the wiki on the github page.

mbaudis commented 9 years ago

@ekg Yes; I had pushed for the same in MTT calls.

richarddurbin commented 9 years ago

+1

I support this. Perhaps we can discuss on an upcoming DWG call, Stephen.

richarddurbin commented 9 years ago

+1

I’ll put it on the agenda for the 25th. We need a person or people to take ownership of this to make it happen.

pgrosu commented 9 years ago

++1 - This is extremely important! Back in May of last year it took me one month to get completely up to speed, using the issues to follow the reasoning behind the API, though at the time we were changing our approaches more frequently.

diekhans commented 9 years ago

+1 on documentation in git separate from avdl files. -1 on using the wiki

It's vitally important that full, normative documentation be part of the GA4GH API. A wiki is insufficient for producing API and software documentation. The documentation needs to be under source control and be part of the release process. A pull request would include the documentation. It would not be accepted until the documentation is adequate. When a release is made, the release will include documentation that exactly matches the API and code. The documentation is automatically built as part of building the code.

This is the only approach I have seen in decades of software engineering that actually works, and works well. It isn't a huge burden and helps make developers responsible for documenting their work.

There have been many systems developed to generate integrated documentation from a combination of code and code comments, with larger pieces of documentation, images, etc. Javadoc does a nice job of this in the Java world. Sphinx is a really nice tool used by Python and other languages. It can generate beautiful web pages and PDFs. There currently isn't a Sphinx adapter for avdl, although it should be reasonably straightforward to write.

Mark
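
For illustration only, the rule described above (a pull request that changes the schema should also carry documentation) could be checked mechanically with something like the sketch below. The `docs/` folder name, the constants and the function name are all assumptions made up for this example; nothing like it exists in the repository, and "adequate" documentation would still need a human reviewer.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

# Hypothetical layout: schemas are *.avdl files, documentation lives under docs/.
SCHEMA_SUFFIX = ".avdl"
DOCS_DIR = "docs"


def schema_changes_without_docs(changed_paths: Iterable[str]) -> List[str]:
    """Return the changed schema files if the change set touches no docs file.

    An empty list means the change set passes this (very crude) check.
    """
    paths = [PurePosixPath(p) for p in changed_paths]
    schemas = [str(p) for p in paths if p.suffix == SCHEMA_SUFFIX]
    docs_touched = any(p.parts and p.parts[0] == DOCS_DIR for p in paths)
    return [] if docs_touched else schemas


if __name__ == "__main__":
    # A schema-only change is flagged; adding a docs file clears it.
    print(schema_changes_without_docs(["src/main/resources/avro/beacon.avdl"]))
    print(schema_changes_without_docs(
        ["src/main/resources/avro/beacon.avdl", "docs/beacon.md"]))
```

A check like this only proves that some documentation file changed alongside the schema, which is the "in sync" part of the argument; it says nothing about quality.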

pgrosu commented 9 years ago

+1, I would have to concur with Mark on this. Technical documentation that stays close to the definitions in the code (Javadoc, R, etc.) definitely streamlines the integration and implementation process.

https://readthedocs.org/ could provide that nice integration with what we are doing.

diekhans commented 9 years ago

This will not produce as pretty a result, but it will solve the most important problem: keeping the docs in sync with the code.

We can also add a gitdoc -> Sphinx pipeline in the future; they both use a markdown.

Benedict Paten benedict@soe.ucsc.edu writes:

In my opinion there should be coordinated version control on the docs alongside the schemas. The GitHub wikis, while technically under version control, do not make this synchrony with the main repository easy.

However, we could create a sort of wiki by using GitHub Flavored Markdown files (https://help.github.com/articles/github-flavored-markdown/) in a docs folder of the schemas repo. This is then very much like the wiki (in editing terms), but is under version control with the main schemas, and so can evolve and be versioned alongside them. The markdown files are very easy to edit, and this can be done directly from the GitHub site (using their editor).

 

nouyang-curoverse commented 9 years ago

-0, I opened this issue with "human-readable" for a reason, see http://stevelosh.com/blog/2013/09/teach-dont-tell/#the-reference

I feel a particularly deep form of rage every time I click on a “documentation” link and see auto-generated documentation. There’s no substitute for documentation written, organized, and edited by hand.

I agree that if we're not in a stable enough state, or no one is willing to take charge of human-readable docs, then we should at least have up-to-date auto-generated docs (manpages). But that is not why I opened this issue.

Specifically, right now if someone asks me "How do I ask a Beacon v0.2 for variants that say 'AAGA' at hg19, chr17:3857?", neither they, nor even many of the GitHub contributors, nor anyone on any other task team has any idea what the answer is, since all they can see is the Avro schema and a lot of meandering conversation across 12 different issues. I think the answer should be self-evident and explicitly recorded in our GitHub conversations (if we are sticking with conversations on GitHub).

I'm not certain about the status of the other avdls, so perhaps I should explicitly state that I only know this is an issue for beacon.avdl:

https://github.com/ga4gh/schemas/blob/master/src/main/resources/avro/beacon.avdl

pgrosu commented 9 years ago

@nouyang-curoverse, I agree with you and I think that is why @adamnovak created the following:

https://github.com/ga4gh/schemas/pull/242

I think we may need to take a step back and write a concise rough doc, with some diagrams, showing where all the pieces fit together. I think that would, to an extent, sync all the pieces of the different projects. I have seen cases in the past where implementations became so divergent that they did not properly talk to each other, and a significant rewrite was required. I understand these are API definitions, but it might be good to create a summary report of all the projects that describes the design, implementation and motivation of each. This would include the exact implementation state each is currently in, with next-step action items, and it would be helpful to update it periodically. At that stage, careful discussion and implementation of integration would help ensure that all the API parts communicate and integrate smoothly with each other. This is how API teams interface with other key members and project teams on large and complex projects in industry, when implementing a complete set of products or a framework throughout the software development lifecycle. It ensures all parts are in sync and on track so no surprises creep up, which does require periodically revisiting all issues to see where they stand and whether they fit properly with the stated and defined goals.

Having said that - which will synchronize the projects - I still think that API-specific docs will ensure implementations are created based on specific descriptions that list the definitions, assumptions and properties, which include examples that can be turned into test-cases. This only follows after what I stated in the previous paragraph is thoroughly vetted periodically.

Paul

fnothaft commented 9 years ago

I agree with @diekhans about having the docs in the repo, but also agree with @ekg that it should look like a wiki. However, I don't think there's a huge gap between the two. If you just create a docs folder and drop markdown files in there, the files are rendered as if they are wiki-style Markdown.

At Berkeley, we use pandoc to generate documentation. We keep the documentation in the project git repo, and we then package it up with each release (and optionally with each build).

@nouyang-curoverse, I agree with you and I think that is why @adamnovak created the following:

#242

#242 isn't what I would call human-readable documentation. They are generated API docs, which are insufficient for understanding how a large system interacts. There need to be higher-level, human-generated, human-readable docs. I don't think that API docs are bad (I find them quite useful), but standalone API docs are insufficient.

ttriche commented 9 years ago

Slight digression regarding my own effort to condense the schema docs:

At one point I tried dumping the Avro schema into something that could be automatically diagrammed (dot, Gviz, d3, who cares how). Couldn't make it work and gave up after an hour or two. But if the dynamically generated schema figures/doc came after the human-readable rationale for why the schema exists as such, that might address both problems simultaneously. I think a post from Kenna Shaw provoked my own effort; someone very smart who didn't have much time to spend reading AVDLs. If the API built on these schemas is to be adopted, individuals in that sort of position will need a technically correct but concise and persuasive explanation of why they should care. Figures similar to the UML diagrams previously posted, but dynamically generated, could serve as the exploded parts diagram, reserving the text of the docs for relevant discussion of why things are as they are, and what problems they solve.

"To retrieve a proper MHC reference structure under the current representation, we do XYZ. This is inefficient and incomplete. With the GA4GH API, the same query returns ABC as an instance of type Foo, obviating the problem and allowing users to ask far more clinically relevant questions of the data (see figure 123 for specifics of Foo)"

Basically a vignette, in Bioconductor terms. That aspect of BioC -- all code MUST have a substantial example of its application that is successfully run at build time every night -- is perhaps the best user-facing advertisement for that particular project, not least because Google then indexes the generated vignettes every night. Some of the vignettes have more citations than the refereed journal articles describing the same software.
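
As a rough analogue of that "example that must run at build time" idea outside Bioconductor, Python's standard `doctest` module executes the examples embedded in documentation strings. The sketch below is only an illustration: `reverse_complement` is a toy stand-in and `failed_examples` is a hypothetical helper, neither is part of any GA4GH code.

```python
import doctest


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence.

    The example below is documentation and test at the same time:

    >>> reverse_complement("AAGA")
    'TCTT'
    """
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def failed_examples(func) -> int:
    """Run the docstring examples of func and return how many of them failed."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    failed = 0
    for test in finder.find(func, name=func.__name__, globs={func.__name__: func}):
        failed += runner.run(test).failed
    return failed


if __name__ == "__main__":
    # A build script could refuse to publish docs when any example fails.
    raise SystemExit(1 if failed_examples(reverse_complement) else 0)
```

If the documented output ever drifts from what the code returns, the build fails, which is the property the vignette requirement is meant to guarantee.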

If people are voluntarily going to adopt the API as service providers, it would help if its documentation revealed what problems (unsolved by current approaches) are addressed by the GA4GH schemas.

Haussler's epic tome, for example, lives in a Git pull request comment. That is probably not optimal for widespread adoption...

JMHO

--t

lh3 commented 9 years ago

By human readable documentation, I guess we mean users' manual or perhaps cookbook. I agree we need that. As to the format, I am also -1 on wiki. I'd prefer something that is more explicitly versioned and can be converted to a well-formatted PDF book. Pandoc seems good, though I haven't used it before.

pgrosu commented 9 years ago

+1 Tim

+1 Heng on the PDF and cookbook. Reading PDFs brings me happiness :)

nouyang-curoverse commented 9 years ago

+1 to cookbook.

Re: pulling out "use case examples, discussion" @pgrosu and @richarddurbin -- I believe MFiume and I plan on doing this for Beacon sometime next week. I'll check back in when that's done, and perhaps it can serve as a template for the other APIs, and after that we can work on achieving API sync across the task teams.

p.s. Can someone please add a Beacon label, or tell me who can assign labels to issues? I still can't assign labels, it seems, and we're in sore need of more issue categorization.

Aside: I personally find more happiness reading well-designed websites than PDFs. PDFs are hard to search, copy-paste from, or view in multiple "tabs", and they take a long time to load, among other issues. (The Atmel PDFs [1] are well-formatted, but leave something to be desired from the beginner's perspective.) So I'm happy so long as the source files for our human-readable cookbook are in some plaintext doc format and PDF is just one output.

[1] http://www.atmel.com/images/Atmel-8271-8-bit-AVR-Microcontroller-ATmega48A-48PA-88A-88PA-168A-168PA-328-328P_datasheet_Complete.pdf

pgrosu commented 9 years ago

@nouyang-curoverse, sounds good and look forward to it. After having taken both FPGA and OS courses, to me this manual doesn't look so bad, but this is for a different domain. It's all about context. Sometimes reading papers (as PDFs) in machine learning or genetics I would find relaxing, but some good beer with a great soccer game on TV will always be more fun :)

~p

diekhans commented 9 years ago

In many ways, the documentation is far more important than the schema. It's critical to capture the rationale for doing things a certain way.

adamnovak commented 9 years ago

#242 isn't just generated documentation. It also has FAQs, which is where things like "How do I ask a Beacon v0.2 for variants that say 'AAGA' at hg19, chr17:3857" might eventually go (maybe in BeaconFAQ.md).

nro-bot commented 9 years ago

@adamnovak I guess I still don't understand #242. It seems to be a lot of syntax for generating SVG files automatically plus a random documentation file (for graphs), instead of the actual human-readable SVG files and a skeleton of doc topics that would facilitate discussion. Perhaps adding a "Purpose of this /doc Folder" section would be useful.

Relequestual commented 9 years ago

I attempted to find out how to implement the "current" beacon API. I genuinely am unable to work it out. There is an avro schema, and no documentation that explains how I should use it, nor any examples. As someone looking into beacon for the first time, I'd be at a total loss as to what I should do.

ekg commented 9 years ago

I guess we should just use standard techniques for communicating these. For the Beacon API, just write a document and provide example code (perhaps inline). Precise specifications can follow.

Relequestual commented 9 years ago

@ekg Beacon is the simplest of the APIs that GA4GH has specified. Yet the average developer on the street probably wouldn't know where to start. I feel this is a bad position to be in. I'd be happy to help out with documentation, but I actually don't know myself!

diekhans commented 9 years ago

Since Beacon is currently completely independent of the DWG API, they really should be in their own repo, where they can move at their own speed and use their own methodologies. If they decide they want to be more integrated into the DWG API, then they can be moved in and made consistent.

Right now, it's just confusing to everyone.

Relequestual commented 9 years ago

Sounds reasonable to me. Who can make that happen?

benedictpaten commented 9 years ago

We are working on this.

ttriche commented 9 years ago

I'll reiterate what I said long ago: if the foundational avro schema (or perhaps protobufs?) can be turned into a reasonably clear figure with reasonably clear prose justification, written by a human and introducing the figure, then a reasonable human will be more likely to implement the proposed APIs.

I tried, and failed, to generate accurate figures from the schemas, because I couldn't find any suitable tools (and I'm not getting paid to write a new one). If this could be done, the Beacon API is the best candidate to demonstrate how. This is a recurrent problem in GA4GH that could benefit from a graceful solution (in fact I suspect it's a widely recurrent problem far beyond GA4GH).
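
For what it's worth, a very small version of "schema in, figure out" can be sketched with nothing but the standard library, under strong assumptions: the AVDL has already been converted to Avro's JSON schema form, only top-level records are drawn, and inline nested definitions are not followed. The record names below (`Position`, `Query`) are invented for the example and are not the GA4GH schema; the output is Graphviz DOT text that would still have to be rendered separately.

```python
import json
from typing import Any, Dict, List, Set


def type_names(avro_type: Any) -> Set[str]:
    """Collect the type names referenced by an Avro type expression."""
    if isinstance(avro_type, str):
        return {avro_type}
    if isinstance(avro_type, list):  # union, e.g. ["null", "Position"]
        return set().union(*(type_names(t) for t in avro_type))
    if isinstance(avro_type, dict):
        kind = avro_type.get("type")
        if kind == "array":
            return type_names(avro_type["items"])
        if kind == "map":
            return type_names(avro_type["values"])
        if "name" in avro_type:  # inline record/enum definition (not recursed into)
            return {avro_type["name"]}
    return set()


def records_to_dot(records: List[Dict[str, Any]]) -> str:
    """Render records as a DOT digraph: one node per record, one edge per
    field whose type refers to another record in the same list."""
    known = {r["name"] for r in records}
    lines = ["digraph schema {", "  node [shape=box];"]
    for record in records:
        lines.append(f'  "{record["name"]}";')
        for field in record.get("fields", []):
            for target in sorted(type_names(field["type"]) & known):
                lines.append(
                    f'  "{record["name"]}" -> "{target}" [label="{field["name"]}"];')
    lines.append("}")
    return "\n".join(lines)


# Invented example records, NOT taken from the GA4GH schemas.
EXAMPLE = json.loads("""
[
  {"type": "record", "name": "Position",
   "fields": [{"name": "chromosome", "type": "string"},
              {"name": "coordinate", "type": "long"}]},
  {"type": "record", "name": "Query",
   "fields": [{"name": "position", "type": "Position"},
              {"name": "alleles", "type": {"type": "array", "items": "string"}},
              {"name": "previous", "type": ["null", "Query"]}]}
]
""")

if __name__ == "__main__":
    print(records_to_dot(EXAMPLE))
```

This draws boxes and arrows only; the human-written prose explaining why the boxes exist is the part no generator supplies.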

If it can't be solved gracefully, maybe there is a deeper issue involved. I'd gladly address this myself, but previous attempts convinced me that automatic drawings of Avro schemas were not (yet?) a common request. I'd also like to implement a beacon that balances the privacy and dignity of our patients and donors with a more open approach to sharing data, and at the moment, it's not very clear to me how best to do that. (That may be my fault.)

best,

--t

Relequestual commented 9 years ago

@benedictpaten OK. Who is "we" and is there opportunity for others to help? There doesn't seem to be much visibility on what is happening and what isn't. On a side-bar, I'm starting to feel the new ga4gh technical website is a step backwards in terms of usefulness.

maximilianh commented 9 years ago

@benhutton: does this help? https://github.com/maximilianh/ucscBeacon

diekhans commented 9 years ago

"We" is UCSC. A preliminary version of the documentation is here:

http://hgwdev.cse.ucsc.edu/~jeltje/build/html/introduction.html

The documentation in the schema is also being updated. We will have an Avro Sphinx plugin to link things up.

We are waiting on the linear branch to be accepted before starting to merge all of this and doing a PR.

Once we have that in place, we will be cornering the task teams to help them round out their documentation.

Relequestual commented 9 years ago

@diekhans OK, thank you. That makes sense.

@maximilianh A bit, but I guess I'm coming from the false assumption that people will be writing APIs into existing systems as opposed to creating standalone systems. We (Decipher) would want to make it part of our existing code base, not run a new process. It is still useful to be able to view an implementation though, so thanks for the link! =]

maximilianh commented 9 years ago

@benhutton: my assumption was that people usually have Apache running anyway and have their variants as .VCF files somewhere. In this case, all they have to do is extract the archive and import their VCF files.
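
To make the "variants in VCF files" scenario concrete, here is a minimal sketch of a yes/no lookup over VCF data lines. It is not how ucscBeacon works and it is not the Beacon API: the function name, the 1-based position convention (borrowed from VCF) and the exact-match rule on ALT alleles are all assumptions chosen for the example, and the VCF records are made up.

```python
from typing import Iterable


def beacon_has_allele(vcf_lines: Iterable[str], chrom: str, pos: int, allele: str) -> bool:
    """Answer a yes/no question: is `allele` listed as an ALT allele at chrom:pos?

    Assumptions (not taken from the Beacon schema): `pos` is 1-based as in VCF,
    and a match means an exact string match against one of the ALT alleles.
    """
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):  # skip blank, meta and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a usable VCF data line
        record_chrom, record_pos, alts = fields[0], fields[1], fields[4]
        if record_chrom == chrom and record_pos.isdigit() and int(record_pos) == pos:
            if allele in alts.split(","):
                return True
    return False


# Made-up records for illustration only.
EXAMPLE_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "17\t3857\t.\tA\tAAGA,G\t.\t.\t.",
    "17\t4000\t.\tC\tT\t.\t.\t.",
]

if __name__ == "__main__":
    print(beacon_has_allele(EXAMPLE_VCF, "17", 3857, "AAGA"))
    print(beacon_has_allele(EXAMPLE_VCF, "chr17", 3857, "AAGA"))
```

Even this toy shows the kind of decisions a cookbook would have to spell out: whether the chromosome is written "17" or "chr17", whether positions are 0-based or 1-based, and what counts as a matching allele.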

benhutton commented 9 years ago

hey @maximilianh I am @benhutton but I am NOT the person you are trying to talk to. I think you're trying to talk to @Relequestual, a different Ben Hutton (nice to meet you!)

maximilianh commented 9 years ago

sorry @benhutton (was replying by email where usernames are not shown), tagging the other ben hutton now @Relequestual

Relequestual commented 9 years ago

haha classic. @maximilianh Maybe that's a safe assumption for most cases (I don't know, I'm still newish to the biology field), but that isn't the case for us (Decipher). We have variants in a database format: not ALL of a patient's variants, only the ones of interest, deposited by various projects around the world.