airr-community / airr-standards

AIRR Community Data Standards
https://docs.airr-community.org
Creative Commons Attribution 4.0 International
35 stars 23 forks source link

Germline doc #593

Closed williamdlees closed 2 years ago

williamdlees commented 2 years ago

This closes multiple points raised in #559 and #562 - see comment in those threads, and the change notes below. In addition I have raise two new minor issues #591 and #592 so that #559 and #562 can be closed once this pull request is accepted and merged.

williamdlees commented 2 years ago

Also now addresses #556 and #572

williamdlees commented 2 years ago

And is also intended to close #590

javh commented 2 years ago

I fixed the R check error and went over the changes. This looks good overall.

My only concern is the various _description fields. I was originally thinking they would just be named description within each object, rather than something like Rearranged:: rearranged_sequence_description. It looks weird no matter what, especially when release_description is in the same object.

I'm really not sure what to do with those fields. What do you think about naming them something like curation_description?

williamdlees commented 2 years ago

germline_description_description doesn't trip off the tongue. Changed the _descriptions to curational_notes, hope that's ok

javh commented 2 years ago

I think if the only place we ever plan to use _notes is in the germline objects, then I think that's fine. If we think we might add similar fields outside these objects then I'd like to avoid (well, try to avoid) any future semantic inconsistencies. Maybe we can do a quick informal poll?

Some suffix suggestions:

Prefix ideas:

I might be a bit overly cautious about future-proofing here, but... thoughts?

scharch commented 2 years ago

I kinda like just comments A prefix/suffix seems like overkill unless we think we will also have some other kind of _notes (or some other curation_, I suppose)

bussec commented 2 years ago

@williamdlees I updated the sections of the Schema that were changed by #524, #574 and #603, but excluded the (non-MHC) germline objects. The merge to master should now work again, the only problem is that the R test is failing again (@javh, any ideas)?

javh commented 2 years ago

I just checked the R package locally and I'm not seeing the same Duplicate map key: 'InformationProvider' error from yaml::yaml.load_file. I'll have to fight with it.

javh commented 2 years ago

The R check failure was due to merging against master generating a duplicate copy of the InformationProvider object in the specs, which is why the error wasn't present locally. I'm not sure how that happened, but I blindly blame the force-push creating an improper merge conflict resolve.

Seems fixed now.

bussec commented 2 years ago

@javh Thanks for all the troubleshooting! I really hate to say it but you fixed the consistency check, but now the merge conflicts are back :cry: I did so testing locally and the order of the commits seems to matter, so let me check whether squashing solves the issue...

javh commented 2 years ago

Haha. Okay, well.. Let's hope. We may have to clean it up post-merge or something. I think we just have the "notes" fields to deal with for this PR, yes?

bussec commented 2 years ago

Finally! It did not require squashing, a rebase to the current master was sufficient. So know we got consistency checks, commit history and the ability to actually merge this :tada: Everyone who is on a local repo, please remember to update this branch before making any new commits.

bussec commented 2 years ago

@javh

I think if the only place we ever plan to use _notes is in the germline objects, then I think that's fine. [...] Maybe we can do a quick informal poll?

In general I hope that we will not need additional comments fields like this, as they might be abused as dumps for information that should go into other existing fields. However, I recognize that when representing/exchanging GLDB data there is a valid use case for comments on the curation process. So it is ok to make an exception here, but IMO we do not need to agree on a common suffix.

bussec commented 2 years ago

@javh Is there anything more we need to discuss regarding the curational_notes?

javh commented 2 years ago

Not really, I think you're right that it's unlikely to have a broader use case, so we just need to pick something. Any objections to just curation?

williamdlees commented 2 years ago

No objection from me.

javh edit: Removed email headers.

williamdlees commented 1 year ago

I kinda like just comments. A prefix/suffix seems like overkill unless we think we will also have some other kind of notes (or some other curation, I suppose)

This would be ok with me. The description already says ‘curational notes’ so it doesn’t really matter if it’s in the name. In my use case at least.

javh edit: remove email headers.