Closed hkir-dev closed 4 months ago
Here is the updated documentation on the tables. The documentation on basic editing, with screenshots, will be added in the next comment. The UI will be updated once this comment is approved.
Welcome to the Taxonomy Development Tools User Interface Guide. This document is designed to provide comprehensive details on navigating and utilizing the TDT interface efficiently. Whether you are looking to manage data tables, edit information, or leverage advanced features, this guide will assist you in making the most out of TDT.
At the heart of the Taxonomy Development Tools is a robust internal database designed to streamline the management and curation of taxonomy-related data. Access to this database is facilitated through a user-friendly interface, with tables being a central component.
To view the available tables, navigate to the Tables dropdown menu at the top of the interface.
TDT categorizes tables into two main types, switch system tables and user tables, each serving distinct purposes:
Switch system tables: these tables are essential for the internal configuration of the TDT and cannot be modified by users.
- table: lists all the tables present in the TDT; it appears on the default page of the TDT.
- column: contains all the columns present in each table.
- message: contains all the messages present on every row of each table.
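As a rough illustration of how the three system tables relate to one another, the sketch below models them as plain Python records. The row layouts, the `AIT115_annotation` table name, and the sample values are assumptions made for illustration; they are not the actual TDT schema.

```python
from collections import defaultdict

# Hypothetical, simplified rows; the real TDT system tables may hold more fields.
TABLE_ROWS = [
    {"table": "table"},
    {"table": "column"},
    {"table": "message"},
    {"table": "AIT115_annotation"},  # assumed user table name
]
COLUMN_ROWS = [
    {"table": "AIT115_annotation", "column": "cell_label"},
    {"table": "AIT115_annotation", "column": "cell_ontology_term_id"},
]
MESSAGE_ROWS = [
    {"table": "AIT115_annotation", "row": 3, "column": "cell_ontology_term_id",
     "message": "example validation message"},
]


def columns_by_table(column_rows):
    """Group column names under the table they belong to."""
    grouped = defaultdict(list)
    for row in column_rows:
        grouped[row["table"]].append(row["column"])
    return dict(grouped)


def messages_for_row(message_rows, table, row_number):
    """Return the messages attached to one row of one table."""
    return [m["message"] for m in message_rows
            if m["table"] == table and m["row"] == row_number]


print(columns_by_table(COLUMN_ROWS))
print(messages_for_row(MESSAGE_ROWS, "AIT115_annotation", 3))
```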
User tables
User tables are created when data is uploaded to the TDT using the load_data operation (https://brain-bican.github.io/taxonomy-development-tools/Curation/). This data is formatted according to the Cell Annotation Schema and organized into multiple interrelated tables.
Example: the nhp_basal_ganglia_taxonomy provides an annotation table named AIT115_annotation_sheet; from this table, a series of user tables is generated and displayed in the TDT.
The user tables are the following:
Original data table with the author annotations (e.g. AIT115_annotation_sheet.tsv)
*_metadata: This table contains all the metadata related to the taxonomy. For full specifications of the metadata properties, see the properties section of the Cell Annotation Schema documentation. The *_metadata column names are explained below:
- author name: the name of the first author of the taxonomy.
- author contact: the author's email.
- author list: the names of the secondary authors.
- matrix file ID: a resolvable ID for a cell-by-gene matrix file.
- cellannotation schema version: the version of the Cell Annotation Schema.
- cellannotation timestamp: the date (yyyy-mm-dd) when the cell annotations are published.
- cellannotation url: a URL where all cell annotations are published for each dataset.
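The metadata columns described above could be sanity-checked with a small script before upload. The following is a minimal sketch, not part of TDT: the underscore field names, the sample values, and the `check_metadata` helper are all assumptions for illustration.

```python
from datetime import date

# Assumed machine-readable names for the metadata columns described above.
EXPECTED_FIELDS = (
    "author_name", "author_contact", "author_list", "matrix_file_id",
    "cellannotation_schema_version", "cellannotation_timestamp",
    "cellannotation_url",
)


def check_metadata(record):
    """Return a list of human-readable problems found in a metadata record."""
    problems = [f"missing field: {name}"
                for name in EXPECTED_FIELDS if not record.get(name)]
    timestamp = record.get("cellannotation_timestamp")
    if timestamp:
        try:
            date.fromisoformat(timestamp)  # accepts yyyy-mm-dd
        except ValueError:
            problems.append(f"timestamp is not yyyy-mm-dd: {timestamp}")
    contact = record.get("author_contact")
    if contact and "@" not in contact:
        problems.append(f"author contact does not look like an email: {contact}")
    return problems


# Invented sample record.
example = {
    "author_name": "Jane Doe",
    "author_contact": "jane.doe@example.org",
    "author_list": "John Roe",
    "matrix_file_id": "example:matrix-0001",
    "cellannotation_schema_version": "0.0.0",
    "cellannotation_timestamp": "2024-01-31",
    "cellannotation_url": "https://example.org/annotations",
}
print(check_metadata(example))
```

Returning a list of problems rather than raising on the first one lets a curator fix every issue in a single pass.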
*_labelset: This table contains the definitions of the labels used in the annotation and the methodology used to acquire those labels. Full specifications of the label set can be found in the Cell Annotation Schema documentation under the labelsets section.
- name: the name of the type of annotation key.
- description: a description of the annotation key.
- rank: the level of granularity of the annotation, with 0 being the most specific.
- annotation method: the method used for this type of annotation; it can be algorithmic, manual, or both.
- automated annotation algorithm name: the name of the algorithm used for the automated annotation.
- automated annotation algorithm version: the version of the algorithm used.
- automated annotation algorithm repo url: a resolvable URL of the version control repository of the algorithm used.
- automated annotation reference location: a resolvable URL of the source of the data.
*_annotation: Stores annotations for cell types, classes, or states, along with supporting evidence and provenance information. It is designed to be flexible, allowing additional fields to accommodate user needs or project-specific metadata. Further information on the annotation columns can be found in the Cell Annotation Schema documentation under the annotations section.
- cell set accession: an identifier that can be used to consistently refer to the set of cells being annotated, even if the cell_label changes.
- cell label: the cell annotation provided by the author.
- cell fullname: the full-length term of the annotated cell set.
- parent cell set accession: similar to the cell set accession, this identifies the set of cells one step higher in the hierarchical classification than the cells in the row.
- labelset: the type of cell annotation from the AnnData/Seurat file.
- cell ontology term id: the ID of the ontology term that defines the cell type. It has to be the closest term matching the cell label.
- cell ontology term: the name of the ontology term identified by the ontology term ID.
- rationale: the short name of the publications used to define the cell ontology term.
- rationale dois: the DOIs of the publications mentioned in the rationale.
- marker gene evidence: list of the names of genes whose expression in the cells being annotated is explicitly used as evidence for this cell annotation. Each gene MUST be included in the matrix of the AnnData/Seurat file.
- synonyms: synonyms of the cell label.
- Supertype:
- region.info Frequency:
- Cluster size: The number of cells present in that cluster.
- Gene counts: The number of genes detected in the cluster.
- UMI counts: The number of UMI detected in the cluster.
- AIT21 ABC atlas subclass homology: The homologous term to cell label present in the Allen Brain Cell Atlas.
- Binary genes: Genes expressed in the cluster.
- NSForest markers combo: A set of genes obtained using the NS-Forest machine learning algorithm to identify clusters.
- NSForest F1 score:
- Curated markers:
- Comments:
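To make the relationship between cell set accession, parent cell set accession, and the labelset rank more concrete, here is a minimal sketch that walks from one annotation row up to the top of the hierarchy. The accessions, labels, labelset names, and underscore keys are invented sample data, and `lineage` is a hypothetical helper, not a TDT function.

```python
# Invented sample data; underscore keys stand in for the columns described above.
LABELSET_RANK = {"cluster": 0, "subclass": 1, "class": 2}  # 0 = most specific

ANNOTATIONS = [
    {"cell_set_accession": "CS_3", "cell_label": "class A",
     "parent_cell_set_accession": "", "labelset": "class"},
    {"cell_set_accession": "CS_2", "cell_label": "subclass A1",
     "parent_cell_set_accession": "CS_3", "labelset": "subclass"},
    {"cell_set_accession": "CS_1", "cell_label": "cluster 7",
     "parent_cell_set_accession": "CS_2", "labelset": "cluster"},
]


def lineage(annotations, accession):
    """Follow parent accessions upward, returning rows from the start to the root."""
    by_accession = {row["cell_set_accession"]: row for row in annotations}
    path, seen = [], set()
    while accession:
        if accession in seen:
            raise ValueError(f"cycle detected at {accession}")
        seen.add(accession)
        row = by_accession[accession]
        path.append(row)
        accession = row["parent_cell_set_accession"]
    return path


for row in lineage(ANNOTATIONS, "CS_1"):
    print(LABELSET_RANK[row["labelset"]], row["labelset"], row["cell_label"])
```

The lookup is keyed on the cell set accession rather than the cell label, since the accession is the value meant to stay stable when a label changes.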
*_annotation_transfer: Tracks annotation transfer records. I need some help with this part, @dosumis; maybe we could discuss it.
For detailed information on table structures and fields, refer to the Cell Annotation Schema documentation.
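The requirement that each marker gene be included in the matrix of the AnnData/Seurat file can be checked with a simple set comparison. This sketch assumes the gene names from the matrix have already been read into a Python list and that the marker genes are stored as a comma-separated string; both are assumptions, as are the helper name and the sample gene symbols.

```python
def missing_marker_genes(marker_gene_evidence, matrix_genes):
    """Return marker genes that are absent from the matrix gene names, in input order."""
    known = set(matrix_genes)
    # Assumed storage format: a comma-separated string of gene names.
    markers = [g.strip() for g in marker_gene_evidence.split(",") if g.strip()]
    return [g for g in markers if g not in known]


# Sample values for illustration only.
matrix_genes = ["GAD1", "GAD2", "SLC17A7", "PVALB"]
print(missing_marker_genes("GAD1, PVALB, SST", matrix_genes))
```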
@hkir-dev - is the doc build automatic or do we need to run it via a make command?
We have a GitHub Action for this: Actions > Publish mkdocs documentation
The action can be triggered manually, and it is also triggered automatically with each release.
Story: #100
Description
As well as lookup type documentation we should also have workflow docs with screenshots.