airr-knowledge / issues

Issues and project management for the AKC

Generate/document list of data standards (and related technologies) being used by each repository #16

Closed by bcorrie 5 months ago

bcorrie commented 7 months ago

I had this on the agenda for today's meeting, but we didn't get there, so I thought I would create this issue and ask the question here:

What do we mean by this? Are we talking DB technologies (e.g. Mongo)? API implementation technologies (e.g. PHP/Laravel)? Standards used for describing specs (e.g. YAML)? Communication protocols (e.g. HTTP)? Data exchange technologies (e.g. JSON, TSV)?

schristley commented 7 months ago

> What do we mean by this? Are we talking DB technologies (e.g. Mongo)? API implementation technologies (e.g. PHP/Laravel)? Standards used for describing specs (e.g. YAML)? Communication protocols (e.g. HTTP)? Data exchange technologies (e.g. JSON, TSV)?

Actual standards like JSON Schema, OpenAPI V3, AIRR standards, OWL, SQL, etc., so yes to everything above. I wouldn't go too crazy with implementation technologies; we mostly want the ones that will be involved in curation/extraction/annotation/validation processes or in the integration process as data flows from the repositories into the AK.

These will help drive the implementation technology decisions for the AK, so we can say, we need something that will support X, Y, and Z. Each repository might want to evaluate their technology stack to optimize their processes, or e.g. in the case of IRAD, the service is being designed (mostly) from scratch.

bcorrie commented 7 months ago

I created a document Standards and Technologies as a place to document this info. It is in the Google Drive.

I have added AIRR Data Commons and iReceptor specific information to the doc. If others could do the same, that would be great. I'm not sure of the granularity we want; I just did a bit of a brain dump for starters.

@schristley I separated out iReceptor and VDJServer in this case, so we need a VDJServer section
@rvita @bpeters42 can you add IEDB info
@williamdlees can you add OGRDB/VDJbase
@krishnaroskin @KevinABurns137 same for IRAD

schristley commented 7 months ago

Thanks @bcorrie for the thorough template. I've added info for VDJServer.

rvita commented 7 months ago

I looked at this doc and it is very technical. A different IEDB team member needs to complete this. @bpeters42 should say who.

bpeters42 commented 7 months ago

The IEDB is a smorgasbord of technologies, given the 2 decades of development. Describing all of it would have no benefit for this project - so the focus should be solely on (as Scott said) "that will be involved in curation/extraction/annotation/validation processes or as part of the integration process as data flows from the repositories into the AK". I hope @jamesaoverton can add that.

schristley commented 7 months ago

> The IEDB is a smorgasbord of technologies, given the 2 decades of development. Describing all of it would have no benefit for this project - so the focus should be solely on (as Scott said) "that will be involved in curation/extraction/annotation/validation processes or as part of the integration process as data flows from the repositories into the AK". I hope @jamesaoverton can add that.

Hi @jamesaoverton, this is tied together with #7 and #8. We can discuss this further in the V&A KG meeting.

schristley commented 6 months ago

IEDB, OGRDB and VDJbase have documents in the Validation and Automation

schristley commented 6 months ago

@krishnaroskin @KevinABurns137 Are you guys actively using any standards as part of IRAD? Or do you consider IRAD to be more of a prototype, with standards usage being in flux? If the latter, then let us know and/or add a small section to the Standards and Technologies document so we can close off this issue.

schristley commented 6 months ago

IRAD section has been added.

schristley commented 5 months ago

@williamdlees Hi William, I just realized that we didn't have a section for OGRDB/VDJbase in the Standards and Technologies document. If you could, please add a section. Take a look at the VDJServer section for the level of detail desired: mainly some information on the tech stack and which standards are being used.

williamdlees commented 5 months ago

Sorry Scott. Have added sections now.

schristley commented 5 months ago

thanks William!