clingen-data-model / data-exchange-shared-json

JSON schema utilized to share data from the curation interfaces into the Data Exchange. CONTAINS GENE EXPRESS JSON DATA

Affiliation ID structure #25

Closed sgoehringer closed 5 years ago

sgoehringer commented 7 years ago

@wrightmw We probably need to start thinking about the naming/ID convention for curation affiliations. I don't think it should be a simple number like 1 or 2, or anything too random, because the different curation tools could auto-generate an ID that is already in use.

My sense is the affiliates would fall into the following general areas...

Based on the assumption above, it feels like each affiliation will be associated with a single, specific curation effort. If that is true, an Affiliation ID can't be shared by multiple curation groups, i.e. a Gene Curation Committee and a Variant Expert Panel will need separate IDs. Don't you think so? So maybe something like... (Just throwing out ideas)

Let me know if you think I am out in left field. I am open to all ideas!

wrightmw commented 7 years ago

Hi @sgoehringer

What you are describing here is more similar to affiliation aliases. We prefer IDs to be namespace agnostic, so we would recommend that the AffiliationIDs are just numbers. Using a namespace as part of an ID can cause problems further down the line. For instance, within your schema there may be one curation group that submits curations to both the GCI and VCI.

@jimmyzhen Please feel free to add any comments about how this impacts on our schema.

sgoehringer commented 7 years ago

@wrightmw I totally understand... if we go with a straight ID (like 1, 2, 3), then we need to make sure that all of the curation teams are on the same page so that the same ID isn't shared by multiple affiliations. We could go with an IRI concept and get ahead of things, since this will most likely hang around for a long time. Glad we are having these discussions, and curious what @jimmyzhen and @tnavatar think.

jimmyzhen commented 7 years ago

@sgoehringer,

Can you give us an example of the ID with an IRI concept (that you may have in mind)? Thanks.

sgoehringer commented 7 years ago

Sure, it was earlier in the thread, but email doesn't show that... anyway, here you go...

cdwgGene1 = ABC Gene Curation Committee
cdwgVariant1 = ZYX Variant Curation Expert Panel
awgActionability1 = Clinical Actionability
gcwgDosage1 = Dosage Sensitivity

An IRI one... I haven't taken a lot of time to think about it but... http://www.clinicalgenome.org/affiliation/1 or http://www.clinicalgenome.org/affiliation/cdwgGene1

The benefit of the IRI is that when other groups start up, we can ask (or software can check) whether the IRI is already in use and/or matches what they were expecting. Again, just brainstorming on the fly a little.
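To make the "software can check" idea concrete, here is a minimal sketch of an in-memory registry. It assumes the base IRI from the examples above; the `AffiliationRegistry` class and its method names are hypothetical illustrations, not part of any ClinGen tool.

```python
# Assumption: base IRI copied from the brainstormed examples; not an agreed convention.
BASE_IRI = "http://www.clinicalgenome.org/affiliation/"


def affiliation_iri(local_id: str) -> str:
    """Build the full IRI for a local affiliation ID."""
    return BASE_IRI + local_id


class AffiliationRegistry:
    """Hypothetical in-memory registry mapping affiliation IRIs to group names."""

    def __init__(self):
        self._names = {}

    def register(self, local_id: str, name: str) -> str:
        """Register a new affiliation; refuse an IRI that is already in use."""
        iri = affiliation_iri(local_id)
        if iri in self._names:
            raise ValueError(f"{iri} is already in use by {self._names[iri]!r}")
        self._names[iri] = name
        return iri

    def is_in_use(self, local_id: str) -> bool:
        """Check whether the IRI for this local ID has been registered."""
        return affiliation_iri(local_id) in self._names

    def matches(self, local_id: str, expected_name: str) -> bool:
        """Check that the IRI is registered to the group the caller expects."""
        return self._names.get(affiliation_iri(local_id)) == expected_name


registry = AffiliationRegistry()
registry.register("cdwgGene1", "ABC Gene Curation Committee")
```

A new group could call `registry.is_in_use(...)` before claiming an ID, and an existing group could call `registry.matches(...)` to confirm the IRI still points at them.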

jimmyzhen commented 7 years ago

@sgoehringer,

Thank you for the clarification.

Would it be potentially problematic if we assign numeric-only IDs, something like 5-digit IDs ranging from 10000-99999?

I can't help but associate strings like "cdwgGene1" with aliases or symbols.
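As a rough illustration of the numeric-only proposal, the sketch below validates a 5-digit ID in the 10000-99999 range and picks the next unused one. The function names and the "lowest free number" allocation rule are assumptions for illustration only; nothing in this thread specifies how IDs would actually be assigned.

```python
import re

# Range taken from the proposal above: 5-digit IDs from 10000 to 99999.
MIN_ID = 10000
MAX_ID = 99999

# Exactly five ASCII digits with a non-zero first digit, i.e. 10000-99999.
_ID_PATTERN = re.compile(r"[1-9][0-9]{4}")


def is_valid_affiliation_id(value: str) -> bool:
    """Return True if value is a numeric-only, 5-digit affiliation ID."""
    return _ID_PATTERN.fullmatch(value) is not None


def next_affiliation_id(used: set) -> str:
    """Return the lowest ID in range that is not in `used` (hypothetical rule)."""
    for candidate in range(MIN_ID, MAX_ID + 1):
        if str(candidate) not in used:
            return str(candidate)
    raise RuntimeError("no affiliation IDs left in the 10000-99999 range")
```

Keeping the IDs as strings avoids accidental arithmetic on them and makes it obvious that something like "cdwgGene1" fails validation.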

wrightmw commented 7 years ago

I agree with @jimmyzhen The AffiliationIDs only need to be unique. The simplest implementation would be to use numbers. We can hook other information, such as Group or Lab names, domain area, etc. to the AffiliationID.
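One way to picture "hooking" other information onto an opaque AffiliationID is a lookup table keyed by the ID. This is only a sketch: the `Affiliation` record, its field names, and the sample ID are hypothetical, and the group name and alias are the placeholder examples from earlier in this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Affiliation:
    """Hypothetical record attached to an opaque, unique AffiliationID."""
    affiliation_id: str           # only needs to be unique
    name: str                     # group or lab name
    domain_area: str              # e.g. gene curation, variant curation
    alias: Optional[str] = None   # human-friendly symbol, kept separate from the ID


# Lookup table keyed by AffiliationID.
affiliations = {}


def add_affiliation(record: Affiliation) -> None:
    """Store a record, rejecting any AffiliationID that is already taken."""
    if record.affiliation_id in affiliations:
        raise ValueError(f"AffiliationID {record.affiliation_id} is already assigned")
    affiliations[record.affiliation_id] = record


# Sample data for illustration only.
add_affiliation(
    Affiliation("10001", "ABC Gene Curation Committee", "gene curation", alias="cdwgGene1")
)
```

Because the name, domain area, and alias live next to the ID rather than inside it, they can change without the ID itself changing.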

tnavatar commented 7 years ago

Numeric is fine, so long as you prefix the numeric ID with your domain (just to make it clear who owns the identifier).

Here’s an example of an OK IRI: https://www.ncbi.nlm.nih.gov/clinvar/submitters/505720/

On Oct 4, 2017, at 5:31 PM, Matt W. Wright notifications@github.com wrote:

I agree with @jimmyzhen https://github.com/jimmyzhen The AffiliationIDs only need to be unique. The simplest implementation would be to use numbers. We can hook other information, such as Group or Lab names, domain area, etc. to the AffiliationID.


wrightmw commented 7 years ago

@sgoehringer On the large GCI call today, Jenny Goldstein ( @jennygoldstein ) presented the plans for UNC to develop a curator tracking system. Within this tracking system, Affiliations would have to be specified so that the workflow of working groups could be tracked. Our team also needs AffiliationIDs in place for when we implement Group login for the interfaces, which will be soon because this feature is planned for the next release (the one after our release this week). Therefore, I think we need to think about generating the AffiliationID list sooner rather than later. The coordinators have a list on Confluence of all the working groups. This list does not have to be comprehensive right now; we just need to know the format.

sgoehringer commented 5 years ago

Cleanup - Closing old issue.