airr-community / airr-standards

AIRR Community Data Standards
https://docs.airr-community.org
Creative Commons Attribution 4.0 International

Start of a changelog for v1.4 #602

Closed bcorrie closed 2 years ago

javh commented 2 years ago

I thought I disabled Travis on master a couple months ago. Let me take a look at it.

javh commented 2 years ago

Looks like this was probably caused by https://github.com/travis-ci/travis-ci/issues/10204

I just removed the Travis CI status check entirely.

bcorrie commented 2 years ago

Hi all, not sure how complete we want this to be and how careful we want to look. I did a diff/merge from v1.3 to master to get the list I made.
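A field-level version of that kind of diff could be sketched roughly as follows. This is only an illustration: it assumes each schema version has already been loaded into a plain dict mapping object names to a `properties` dict, the function name `diff_schema_fields` is hypothetical, and the two toy schemas are stand-ins rather than the real v1.3 or master definitions.

```python
def diff_schema_fields(old, new):
    """Return added and removed field names per object between two schema dicts."""
    changes = {}
    for name in sorted(set(old) | set(new)):
        # An object missing from one version is treated as having no fields there.
        old_fields = set(old.get(name, {}).get("properties", {}))
        new_fields = set(new.get(name, {}).get("properties", {}))
        added = sorted(new_fields - old_fields)
        removed = sorted(old_fields - new_fields)
        if added or removed:
            changes[name] = {"added": added, "removed": removed}
    return changes


# Toy stand-ins for two schema versions (illustrative only, not the real specs).
old_schema = {
    "Rearrangement": {"properties": {"sequence_id": {}, "v_call": {}}},
}
new_schema = {
    "Rearrangement": {
        "properties": {
            "sequence_id": {},
            "v_call": {},
            "v_frameshift": {},
            "j_frameshift": {},
        }
    },
}

for obj, change in diff_schema_fields(old_schema, new_schema).items():
    print(obj, "added:", change["added"], "removed:", change["removed"])
```

A list produced this way only says which fields appeared or disappeared; it does not describe how an existing field's definition changed.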

scharch commented 2 years ago

I guess I'll echo @bcorrie: how complete and careful should this be? I think that most people wouldn't find what's here particularly helpful, except insofar as it tells them what to search for in other parts of the docs. Wouldn't it be better to group the new schemas by category or something? Why do only a couple actually have a description? And for the things that have changed, we should probably say how...

bcorrie commented 2 years ago

Why do only a couple actually have a description? And for the things that have changed, we should probably say how...

Mostly due to time constraints... And the lack of clarity as to how detailed we want to be. The ones with descriptions were from an earlier version. The ones without a description are essentially the shopping list I created when I did the v1.3 - master diff. Doing a good job will take some time, but it should probably be done. After the AIRR Meeting though 8-)

javh commented 2 years ago

My opinion on release notes is that they should be informative to the user, but not burdensome to interpret or read, and details that do not impact the user should not be listed. I prefer sentences/descriptions over a list of changes. I.e., tell me what I need to know, but don't show me all your work.

I think we should either (a) stick to the exact structure/style/voice in the v1.3.0 release notes or (b) refactor all prior release notes to the same style if we go with something new. Laziness favors (a), but I don't object to (b).

My preference is for the original text:

Rearrangement Schema:

1. Added the optional fields ``v_frameshift``, ``j_frameshift``,
   ``d_frame`` and ``d2_frame`` defining annotations related to alignment
   reading frames.

Instead of the new text:

- v_frameshift: annotation field related to alignment reading frames.
- j_frameshift: annotation field related to alignment reading frames.
- d_frame: annotation field related to alignment reading frames.
- d2_frame: annotation field related to alignment reading frames.

The above is also in the wrong section (Alignment instead of Rearrangement), but I'll fix that after we decide on a format.

bcorrie commented 2 years ago

I took a stab at some of the objects and a few fields.

javh commented 2 years ago

@bcorrie can do. Though, I'll wait until we decide on a structure. Redoing everything as a list instead of descriptive text is a lot of effort - rather not do it twice.

javh commented 2 years ago

In the interest of getting this done, I pulled the trigger on the style changes. I didn't get to the germline or single-cell notes yet, but you can review the notes here: https://docs.airr-community.org/en/v1.4-changelog/news.html#schema-release-notes

I'll try to take care of the germline and single-cell stuff after today's meeting.

bcorrie commented 2 years ago

@javh the main difference for SingleCell is that expression_tabular, which was in v1.3, is not in v1.4; the expression table is now external and implemented through the CellExpression object. We also added adc_annotation_cell_id.

Of course CellExpression and Receptor are completely new between 1.3 and 1.4

javh commented 2 years ago

Thanks, @bcorrie. IIRC, adc_annotation_cell_id is supposed to be in the ADC extensions instead (specs/adc-api-openapi3.yaml), correct? I assume that applies to adc_publish_date and adc_update_date in Study as well.

bcorrie commented 2 years ago

No, we need these to be in the actual data model. They are not MiAIRR fields, as they are ADC specific (hence the adc_ prefix), but they need to be fields in the data model, as they need to be stored in the repositories... If they just go in the ADC API spec, they are transient, computed fields, but these need to be stored.

javh commented 2 years ago

Why does it need to be in the main schema instead of living where the other adc_ fields do (eg, adc_v_gene and whatnot)? Pretty sure we agreed to do the extension thing.

Edit: To clarify, I see a good argument for adc_publish_date/adc_update_date being in the main spec (like pub_ids), if our plan is to recommend deposit in the ADC moving forward. My concern with those is more about whether they are properly placed. But, that doesn't apply to adc_annotation_cell_id, which is an ADC custom field.

bcorrie commented 2 years ago

@javh the API spec files are API (query/response) specifications only; they aren't data model specifications.

We want to keep the data model and API separate, with the API based on the data model. So if the API returns a field in the JSON response (something that can be computed from the data model - e.g. adc_v_gene) that isn't in the data model, it can "just" live in the API spec. But if a field needs to be persistent and stored in the repository (adc_annotation_cell_id, adc_publish_date, adc_update_date) then it needs to be in the data model.
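The stored-versus-computed distinction can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the helper names `derive_v_gene` and `build_api_response` are hypothetical, the rule of dropping the allele suffix from `v_call` is just one plausible way a field like adc_v_gene might be computed, and the record is a toy example rather than real repository data.

```python
def derive_v_gene(v_call):
    """Hypothetical derivation: keep the gene name, drop the allele after '*'."""
    return v_call.split("*")[0]


def build_api_response(stored_record):
    """Copy the stored record and add a field computed at response time."""
    response = dict(stored_record)
    # Computed field: can be regenerated from stored data whenever needed,
    # so it does not have to be persisted in the repository.
    response["adc_v_gene"] = derive_v_gene(stored_record["v_call"])
    return response


# Toy stored record (illustrative only). Only data-model fields are persisted.
stored = {"sequence_id": "seq1", "v_call": "IGHV1-69*01"}
response = build_api_response(stored)

print(response["adc_v_gene"])   # derived from v_call
print("adc_v_gene" in stored)   # the stored record itself is unchanged
```

A value such as a publication date has no equivalent of `derive_v_gene`: nothing else in the record determines it, so it has to be stored.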

bcorrie commented 2 years ago

To clarify, I see a good argument for adc_publish_date/adc_update_date being in the main spec (like pub_ids), if our plan is to recommend deposit in the ADC moving forward. My concern with those is more about whether they are properly placed. But, that doesn't apply to adc_annotation_cell_id, which is an ADC custom field.

adc_publish_date and adc_update_date are definitely ADC custom fields - they denote when the data was published and updated in the repository.

javh commented 2 years ago

Eek. Okay, how about this then? Because putting ADC custom fields in the main schema isn't what we agreed on, and that was an agonizingly long discussion, let's move them to the extensions as a placeholder so we can release v1.4 without having more discussion. I'll open a PR for that.

It seems like we might need to rethink how these fields are handled, so let's deal with that after v1.4? As they are all clearly denoted adc_ putting them in the main schema probably won't confuse anyone. But, I'm not sure it's wise to wed the ADC API release cycle to the glacial release cycle of the main schema, especially because we're hoping to stop development in two more releases (v2.1) which would effectively freeze the ADC API custom fields. It's not clear to me what's best here, but I doubt it's solvable without more time/effort.

javh commented 2 years ago

This looks ready to merge from my perspective: https://docs.airr-community.org/en/v1.4-changelog/news.html#schema-release-notes