Closed ubyndr closed 5 months ago
@evanbiederstedt - rather than add these CAP-specific fields, can we use the generic equivalents for most or all fields and move the CAP business logic (e.g. "NOTE: the term \"publication\" refers to the workspace published on CAP with a version and timestamp.") into the specification of how CAP should write to the generic fields? This could live in a separate field in the CAP extension.
I suspect one major issue is making sure CAP gets credit. Maybe we could do this with a separate field for recording the annotation tool; this could also be used by our Taxonomy Development Tools and Cytosplore (both for BICAN).
Also - I think the fields @ubyndr has copied across don't yet reflect the CAP mechanism for dealing with multiple authors.
Ping @evanbiederstedt - can we reconcile? See comments above.
@evanbiederstedt comment on slack:
"Yes, let's remove the CAP prefixes."
So I think the next step is to remove these and merge fields whose names are otherwise identical. If the CAP definition doesn't say anything substantive over the CAS def other than mentioning , the CAP def should be discarded. Anything else we can discuss.
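A minimal sketch of that step, assuming the field definitions are available as plain name-to-description dictionaries. The helper name, the dictionaries, and the descriptions below are hypothetical and are not taken from the schema repository:

```python
CAP_PREFIX = "cap_"


def strip_and_find_collisions(general_fields, cap_fields, prefix=CAP_PREFIX):
    """Strip the prefix from CAP field names and split the result into
    (a) names that now collide with a general field and
    (b) names that exist only on the CAP side."""
    collisions = {}
    cap_only = {}
    for name, cap_desc in cap_fields.items():
        # Drop the prefix only when it is actually present.
        bare = name[len(prefix):] if name.startswith(prefix) else name
        if bare in general_fields:
            # Keep both definitions side by side so a reviewer can decide
            # whether the CAP one adds anything substantive.
            collisions[bare] = (general_fields[bare], cap_desc)
        else:
            cap_only[bare] = cap_desc
    return collisions, cap_only


# Hypothetical input data, for illustration only.
general = {"author_name": "Name of the author."}
cap = {
    "cap_author_name": "CAP author name.",
    "cap_publication_title": "Title of the CAP publication.",
}

collisions, cap_only = strip_and_find_collisions(general, cap)
```

Here `collisions` lists the fields whose two definitions need a manual comparison, and `cap_only` lists the candidates that would remain in the CAP extension.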
Given the schema release, @rm1113 would be in the best position to review this
@mfutey please also check how up to date the field descriptions are, if you could
Rolled schema files: schemas.zip
@evanbiederstedt @dosumis
We still have the cap_ prefixes in the CAP AnnData file schema.
workspace_title = "cap_publication_title"
workspace_description = "cap_publication_description"
workspace_url = "cap_publication_url"
authors_list = "cap_publication_authors_list"
publication_timestamp = "cap_publication_timestamp"
publication_version = "cap_publication_version"
main_author = "cap_author_name"
main_author_orcid = "cap_author_orcid"
main_author_contact = "cap_author_contact"
I worry users will be confused by fields like publication_title, because when I see this field the first thing I think of is the paper title, not a project title on the data portal. Maybe let's discuss each field one by one?
@ubyndr Related issue & discussion: https://github.com/cellannotation/cell-annotation-schema/issues/43#issuecomment-1900492569
We'll make modifications as appropriate
@ubyndr @evanbiederstedt I reviewed the descriptions and they look good to me but as noted above this needs to be updated to reflect changes to the schema:
These are in the CAP extension:
canonical_marker_genes, category_fullname, category_cell_ontology_exists, category_cell_ontology_term_id, category_cell_ontology_term
This is TBA:
authors_list - see #41
Implemented updates based on the latest feedback and comments.
Remove the author fields from this PR. They don't correspond to current names in https://github.com/cellannotation/cell-annotation-schema/blob/main/docs/cap_anndata_schema.md. We should fix in the general schema ASAP. After that we can merge.
Added CAP-specific fields from https://github.com/cellannotation/cap_file_planning/blob/main/cap_anndata_schema.md#cap-encoding-for-anndata-file to CAP_extension.json.
I have a couple of questions regarding the changes:
How do I handle the "required upon publication" property of the fields?
The following file is generated when the extension and the general schema are merged:
Fields like author_name, author_contract and author_orcid seem redundant. Is this acceptable?
There are some fields in https://github.com/cellannotation/cap_file_planning/blob/main/cap_anndata_schema.md#cap-encoding-for-anndata-file that do not exist in the general schema. Am I supposed to add those fields to the extension as well?
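To make the redundancy question concrete, here is a rough sketch of one way such a merge could surface overlapping fields. It assumes both schemas are JSON Schema-style dictionaries with "properties" and "required" keys. The function and the sample schemas are hypothetical and do not reproduce the repository's actual merge script:

```python
import copy


def merge_schemas(general, extension):
    """Merge extension properties into a copy of the general schema and
    report property names that are defined in both."""
    merged = copy.deepcopy(general)
    general_props = general.get("properties", {})
    extension_props = extension.get("properties", {})

    # Names defined on both sides are the candidates for redundancy.
    overlapping = sorted(set(general_props) & set(extension_props))

    # In this sketch the extension definition wins when names overlap.
    merged.setdefault("properties", {}).update(copy.deepcopy(extension_props))

    # Union of the two "required" lists, keeping first-seen order.
    required = list(general.get("required", []))
    for name in extension.get("required", []):
        if name not in required:
            required.append(name)
    merged["required"] = required
    return merged, overlapping


# Hypothetical schemas, for illustration only.
general = {
    "properties": {"author_name": {"type": "string"}},
    "required": ["author_name"],
}
extension = {
    "properties": {
        "author_name": {"type": "string", "description": "CAP author."},
        "publication_title": {"type": "string"},
    },
    "required": ["publication_title"],
}

merged, overlapping = merge_schemas(general, extension)
```

Any name returned in `overlapping` is defined twice and needs a decision about which definition to keep. A "required upon publication" rule is not expressible with the standard "required" keyword alone, so it would need its own convention.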