hkir-dev commented 4 months ago

Story: #100

Description

- What each table means
- Basic editing
  - How to edit parent cluster
  - Cell ontology term
  - Add references
  - ...

As well as lookup-type documentation, we should also have workflow docs with screenshots.

dosumis commented 4 months ago

We have some of this already in UI Guide - needs updating.

AvolaAmg commented 4 months ago

Here is the updated documentation on the tables. The documentation on basic editing, with screenshots, will be added in the next comment. The User Interface guide will be updated once this comment is approved.

Taxonomy Development Tools User Interface Guide

Welcome to the Taxonomy Development Tools User Interface Guide. This document is designed to provide comprehensive details on navigating and utilizing the TDT interface efficiently. Whether you are looking to manage data tables, edit information, or leverage advanced features, this guide will assist you in making the most out of TDT.

Tables
1. Switch system tables
2. User tables
Table Management
Actions
Views

Tables

At the heart of the Taxonomy Development Tools is a robust internal database designed to streamline the management and curation of taxonomy-related data. Access to this database is facilitated through a user-friendly interface, with tables being a central component.

To view the available tables, navigate to the Tables dropdown menu at the top of the interface.

Pasted Graphic 3

TDT categorizes tables into two main types, switch system tables and user tables, each serving distinct purposes:

Switch system tables: these tables are essential for the internal configuration of the TDT and cannot be modified by the users.

table: this table lists all the tables present in the TDT and appears on the default page of the TDT.

Pasted Graphic 4

column: this table contains all the columns present in each table.

Pasted Graphic 1

message: this table contains all the messages present on every row of each table.

Pasted Graphic 5

User tables

User tables are created when data is uploaded to the TDT using the load_data operation (https://brain-bican.github.io/taxonomy-development-tools/Curation/). This data is formatted according to the Cell Annotation Schema and organized into multiple interrelated tables.

Example: the nhp_basal_ganglia_taxonomy provides an annotation table named AIT115_annotation_sheet. From this table, a series of user tables is generated and displayed in the TDT.

The user tables are the following:

Original data table with author annotations

E.g. AIT115_annotation_sheet.tsv

*_metadata: This table contains all the metadata related to the taxonomy. For full specifications of the metadata properties, see the properties section of the Cell Annotation Schema documentation. The *_metadata column names are explained below:

- author name: the name of the first author of the taxonomy.
- author contact: the author's email.
- author list: the names of the secondary authors.
- matrix file ID: a resolvable ID for a cell-by-gene matrix file.
- cellannotation schema version: the version of the Cell Annotation Schema.
- cellannotation timestamp: the date (yyyy-mm-dd) on which the cell annotations were published.
- cellannotation url: a URL where all cell annotations are published for each dataset.
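As a rough illustration of the columns above, a single *_metadata row could be modelled in Python as below. The snake_case keys and every value are assumptions made for this example (placeholders, not taken from the AIT115 taxonomy or from TDT itself). The only check sketched is the yyyy-mm-dd timestamp format mentioned above.

```python
from datetime import datetime

# Hypothetical *_metadata row: keys mirror the columns described above,
# values are placeholders invented for illustration only.
metadata_row = {
    "author_name": "Jane Doe",
    "author_contact": "jane.doe@example.org",
    "author_list": "John Roe|Ann Poe",
    "matrix_file_id": "example:matrix-0001",
    "cellannotation_schema_version": "0.0.0",
    "cellannotation_timestamp": "2024-01-31",
    "cellannotation_url": "https://example.org/annotations",
}


def has_valid_timestamp(row):
    """Return True if cellannotation_timestamp parses as a year-month-day date."""
    try:
        datetime.strptime(row["cellannotation_timestamp"], "%Y-%m-%d")
        return True
    except (KeyError, ValueError):
        return False
```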

Pasted Graphic 6

*_labelset: This table contains the definition of the labels used in the annotation and the methodology used to acquire those labels. Full specifications of the label set can be found in the Cell Annotation Schema documentation under the labelsets section.

- name: the name of the type of annotation key.
- description: a description of the annotation key.
- rank: the level of granularity of the annotation, with 0 being the most specific.
- annotation method: the method used for the type of annotation; it can be algorithmic, manual, or both.
- automated annotation algorithm name: the name of the algorithm used for the automated annotation.
- automated annotation algorithm version: the version of the algorithm used.
- automated annotation algorithm repo url: a resolvable URL of the version control repository of the algorithm used.
- automated annotation reference location: a resolvable URL of the source of the data.
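To make the rank convention concrete (0 is the most specific level), here is a small Python sketch that orders labelset rows from most to least specific. The labelset names, the snake_case keys and the values are hypothetical and chosen only for illustration.

```python
# Hypothetical *_labelset rows; names, keys and values are illustrative assumptions.
labelsets = [
    {"name": "Class", "rank": 2, "annotation_method": "manual"},
    {"name": "Cluster", "rank": 0, "annotation_method": "algorithmic"},
    {"name": "Subclass", "rank": 1, "annotation_method": "both"},
]


def most_specific_first(rows):
    """Sort labelset rows by rank, so rank 0 (most specific) comes first."""
    return sorted(rows, key=lambda row: row["rank"])


ordered_names = [row["name"] for row in most_specific_first(labelsets)]
```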

Pasted Graphic 7

*_annotation: Stores annotations for cell types, classes, or states, along with supporting evidence and provenance information. It is designed to be flexible, allowing for additional fields to accommodate user needs or project-specific metadata. Further information on the annotation columns can be found in the Cell annotation schema documentation under the annotations section.

- cell set accession: an identifier that can be used to consistently refer to the set of cells being annotated, even if the cell_label changes.
- cell label: the cell annotation provided by the author.
- cell fullname: the full-length term of the annotated cell set.
- parent cell set accession: similar to the cell set accession; this is the accession of the set of cells one step higher in the hierarchical classification than the cells in the row.
- labelset: the type of cell annotation from the AnnData/Seurat file.
- cell ontology term id: the ID of the ontology term that defines the cell type. It has to be the closest term matching the cell label.
- cell ontology term: the ontology term name for the ontology term ID.
- rationale: the short name of the publications used to define the cell ontology term.
- rationale dois: the DOI of the paper mentioned in the rationale.
- marker gene evidence: a list of names of genes whose expression in the cells being annotated is explicitly used as evidence for this cell annotation. Each gene MUST be included in the matrix of the AnnData/Seurat file.
- synonyms: synonyms of the cell label.
- Supertype: region.info Frequency:
- Cluster size: the number of cells present in that cluster.
- Gene counts: the number of genes detected in the cluster.
- UMI counts: the number of UMIs detected in the cluster.
- AIT21 ABC atlas subclass homology: the term in the Allen Brain Cell Atlas that is homologous to the cell label.
- Binary genes: genes expressed in the cluster.
- NSForest markers combo: a set of genes obtained using the NS-Forest machine learning algorithm to identify clusters.
- NSForest F1 score:
- Curated markers:
- Comments:
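The relationship between cell set accession and parent cell set accession can be sketched in Python as a walk up the hierarchy. The accessions, labels and snake_case keys below are hypothetical placeholders, and this is only an illustration of the parent links, not how TDT resolves them internally.

```python
# Hypothetical *_annotation rows; accessions and labels are invented placeholders.
annotations = [
    {"cell_set_accession": "EX_001", "cell_label": "cluster A",
     "parent_cell_set_accession": "EX_010"},
    {"cell_set_accession": "EX_010", "cell_label": "subclass X",
     "parent_cell_set_accession": "EX_100"},
    {"cell_set_accession": "EX_100", "cell_label": "class Z",
     "parent_cell_set_accession": ""},
]


def lineage(accession, rows):
    """Return cell labels from the given cell set up to its top-level ancestor."""
    by_accession = {row["cell_set_accession"]: row for row in rows}
    labels = []
    seen = set()  # guards against accidental cycles in the parent links
    while accession and accession in by_accession and accession not in seen:
        seen.add(accession)
        row = by_accession[accession]
        labels.append(row["cell_label"])
        accession = row["parent_cell_set_accession"]
    return labels
```

Calling `lineage("EX_001", annotations)` follows the parent links from the cluster through its subclass to its class.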

Pasted Graphic 9

*_annotation_transfer: Tracks annotation transfer records. I need some help with this part; @dosumis, maybe we could discuss it. For detailed information on table structures and fields, refer to the Cell Annotation Schema documentation.

AIT115_annotation_sheet_annotation_transfer

dosumis commented 4 months ago

Can you turn this into a PR on https://github.com/brain-bican/taxonomy-development-tools/blob/main/docs/UserInterface.md ? I can then make comments in PR review.
Screenshots will need to be here: https://github.com/brain-bican/taxonomy-development-tools/tree/main/docs/images/screenshots - we can swap them out as TDT evolves. Giving them clear names will help.
You don't need to add indexes manually - the doc system builds them (although editing the index doc to add some description of the contents of various docs would be useful).
These are all user specified fields, not part of the standard but specified in the informal taxonomy for basal ganglion. So - doc doesn't belong here (but it may be useful to document these specifically on the Basal Ganglion taxonomy repo, once that is finalised):

- Supertype: region.info Frequency:
- Cluster size: The number of cells present in that cluster.
- Gene counts: The number of genes detected in the cluster.
- UMI counts: The number of UMI detected in the cluster.
- AIT21 ABC atlas subclass homology: The homologous term to cell label present in the Allen Brain Cell Atlas.
- Binary genes: Genes expressed in the cluster.
- NSForest markers combo: A set of genes obtained using the NS-Forest machine learning algorithm to identify clusters.
- NSForest F1 score:
- Curated markers:
- Comments:

dosumis commented 4 months ago

@hkir-dev - is the doc build automatic or do we need to run it via a make command?

hkir-dev commented 4 months ago

We have a GitHub action for this: Actions > Publish mkdocs documentation

The action can be triggered manually, or it is triggered automatically with a release.

brain-bican / taxonomy-development-tools

Documentation about tables and basic editing #104
