DataBiosphere / data-browser

"Specimen" sample type is uninformative #863

Open ambrosejcarr opened 5 years ago

ambrosejcarr commented 5 years ago

The tooltips that occur when you hover over the browser are very helpful!

However, "Specimens" is not an informative name for what I assume is a tissue extract. I needed to look at the second page of studies to find some labeled "Cell lines" and "Organoids" in order to understand what "specimen" might mean by comparison. I would encourage tissue to be broken up by how it was preserved. Fresh, frozen, paraffin-embedded, etc. will have different signatures.

As a user with a keyboard or pipette attempting to compare my data to data in the DCP, I want to be able to find data derived from the same tissue preparation, as I know this has a very significant effect on data quality and the phenotypic signature of the cells.

Second, the "Sample Type" field assumes knowledge of the mechanism of single-cell sequencing (what's a cell suspension?) and I would suggest it be simplified to: "Sample Type: The type of biomaterial containing the analyzed cells. Will be one of cell line, organoid, or tissue".

This will also generalize better when we have imaging data, which does not have a cell suspension.

hannes-ucsc commented 4 years ago

> The tooltips that occur when you hover over the browser are very helpful!
>
> However, "Specimens" is not an informative name for what I assume is a tissue extract. I needed to look at the second page of studies to find some labeled "Cell lines" and "Organoids" in order to understand what "specimen" might mean by comparison.

What should "Specimen" be changed to? It's worth keeping in mind that the Data Browser team does not decide on the nomenclature. The metadata team does. We can cosmetically change names for display, but that would create inconsistency with the metadata nomenclature. The official name is actually specimen_from_organism, which we took the liberty to shorten.

Also note https://github.com/HumanCellAtlas/metadata-schema/issues/991.

> I would encourage tissue to be broken up by how it was preserved. Fresh, frozen, paraffin-embedded, etc. will have different signatures.

It sounds like you are referring to the state_of_specimen, preservation_storage and collection_time properties defined in specimen_from_organism. Which of these should we expose in the Data Browser?

> Second, the "Sample Type" field assumes knowledge of the mechanism of single-cell sequencing (what's a cell suspension?) and I would suggest it be simplified to: "Sample Type: The type of biomaterial containing the analyzed cells. Will be one of cell line, organoid, or tissue".
>
> This will also generalize better when we have imaging data, which does not have a cell suspension.

That should be easy to do but I think we may want to factor it out into a separate ticket, @theathorn, @NoopDog.

ambrosejcarr commented 4 years ago

> What should "Specimen" be changed to?

"Primary Tissue", I suspect, based on the solution proposed in https://github.com/HumanCellAtlas/metadata-schema/issues/991:

> I would like the Data Browser to be able to display the "type" of sample in a project as either "Organoid", "Cell line", or "Primary tissue", which are the 3 major categories of sample types we have

This proposal would fix the problem I observed. 👍

Adding a combination of state_of_specimen and preservation_storage would be useful for users trying to match a particular preservation method, but that wouldn't be a universal use case and I wouldn't be confident advocating for that change without consulting UX.

theathorn commented 4 years ago

@lauraclarke I think this would require the metadata change to replace the value "specimens" with "Primary tissue".

hannes-ucsc commented 4 years ago

We can change the display from "specimen" to "primary tissue" before the renaming of the schema occurs, as long as MD commits to renaming the schema in the foreseeable future.
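
As a rough illustration of what such a display-only rename could look like, here is a minimal sketch. The `DISPLAY_LABELS` mapping, its keys, and the `display_label` helper are hypothetical and not taken from the Data Browser code base; only the relabeling of "specimen" as "primary tissue" comes from this thread.

```python
# Hypothetical presentation-layer mapping; NOT the Data Browser's actual code.
# Keys are assumed metadata values; values are the labels shown to users.
DISPLAY_LABELS = {
    "specimens": "Primary tissue",  # display-only rename proposed in this thread
    "cell_lines": "Cell line",      # assumed key
    "organoids": "Organoid",        # assumed key
}


def display_label(metadata_value: str) -> str:
    """Return the user-facing label, falling back to the raw metadata value."""
    return DISPLAY_LABELS.get(metadata_value, metadata_value)


print(display_label("specimens"))
```

Keeping the rename in one lookup table means the underlying metadata value stays untouched, so the mapping entry can simply be dropped once the schema itself is renamed.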

lauraclarke commented 4 years ago

This hasn't been discussed recently. I'm happy to start the discussion this quarter, with the outcome most likely implemented in Q2. To get it started, can you create an issue in the metadata schema repo (or find the old discussion)? Then we can resolve it and make a plan to move forward.

hannes-ucsc commented 4 years ago

Sounds good. Given that there is the intent to rename the human-readable name, and possibly the metadata schema and entity, I think we can anticipate that and rename the entity in the presentation layer, i.e., the Data Browser.

[edit: added reference to human readable name]

hannes-ucsc commented 4 years ago

The old discussion is here: https://github.com/HumanCellAtlas/metadata-schema/issues/991. It is currently blocking this ticket, but I think we can unblock it.