Open chaoran-chen opened 7 months ago
So I'm imagining subheadings for various things.
(unheaded) frontmatter:
Sequence info:
Additional metadata:
I'd be imagining the config yaml would have
    detailsPage:
      frontmatter:
        isolate_name
        insdc_id
      Sequence_Info:
        length:
        etc.
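To make the config idea a bit more concrete, here is a rough TypeScript sketch of how the website could turn such a config into groups of fields. This is not the existing Loculus code. The type names, the `groupFields` helper, and the assumption that each section is simply a list of metadata field names are all hypothetical.

```typescript
// Hypothetical shape: section name -> ordered list of metadata field names.
type DetailsPageConfig = Record<string, string[]>;
type Metadata = Record<string, string | number | null>;

interface Section {
    heading: string | null; // null for the unheaded frontmatter
    entries: { field: string; value: string | number }[];
}

// Group the metadata of one sequence entry into the configured sections.
// Fields without a value are skipped, and so are sections that end up empty.
function groupFields(config: DetailsPageConfig, metadata: Metadata): Section[] {
    const sections: Section[] = [];
    for (const [name, fields] of Object.entries(config)) {
        const entries = fields
            .filter((field) => metadata[field] !== null && metadata[field] !== undefined)
            .map((field) => ({ field, value: metadata[field] as string | number }));
        if (entries.length === 0) continue;
        sections.push({ heading: name === 'frontmatter' ? null : name, entries });
    }
    return sections;
}
```

Fields that are not listed in any section would simply not be shown with this approach, so we would probably still want a catch-all group like "Additional metadata".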
Also, we should use useEffect() (so it doesn't delay page-load).
See also #100 from July last year.
Here are a few ideas on how to improve the sequence page.
For author lists, we probably want something like journals do:
i.e. abbreviated author lists can be expanded on click. I imagine we need the same features for the datasets page. Authors might have an ORCID or email associated. So we need something that renders a list of structured author data with optional features like links to ORCID etc. Ingested data from NCBI is going to be messy, but a subset of these features will still work.
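As a rough sketch of what the structured author data and the truncation logic could look like (the `Author` type, the helper names, and the default of three visible authors are assumptions for illustration only):

```typescript
// Hypothetical structured author record; orcid and email are optional
// because ingested NCBI data will often lack them.
interface Author {
    name: string;
    orcid?: string;
    email?: string;
}

// Split an author list into the part shown by default and the number of
// authors hidden behind an "expand" click.
function abbreviateAuthors(authors: Author[], maxShown = 3): { shown: Author[]; hiddenCount: number } {
    if (authors.length <= maxShown) {
        return { shown: authors, hiddenCount: 0 };
    }
    return { shown: authors.slice(0, maxShown), hiddenCount: authors.length - maxShown };
}

// Pick the best available link for an author, or null if there is none.
function authorLink(author: Author): string | null {
    if (author.orcid) return `https://orcid.org/${author.orcid}`;
    if (author.email) return `mailto:${author.email}`;
    return null;
}
```

A component would then render `shown`, followed by something like "and N more" when `hiddenCount` is greater than zero. Authors without ORCID or email just render as plain names, which covers the messy ingested case.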
For the host, we can aggregate the information into a field that looks like Homo sapiens (9606) ({surveillance,laboratory,pool}) and links to the NCBI Taxonomy database. There is probably a dictionary to look up common names, which would be very useful (in particular if we target internationalization at some point).
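A minimal sketch of how such an aggregated host field could be built. The `HostInfo` shape and both helper names are hypothetical, and the URL pattern is my assumption about how to link into the NCBI Taxonomy browser by taxon ID:

```typescript
// Hypothetical aggregated host record.
interface HostInfo {
    scientificName: string;
    taxonId: number;
    contexts: string[]; // e.g. surveillance, laboratory, pool
}

// Build the display string: name, taxon ID, then the sampling contexts in braces.
function formatHost(host: HostInfo): string {
    return `${host.scientificName} (${host.taxonId}) ({${host.contexts.join(',')}})`;
}

// Assumed link target: the NCBI Taxonomy browser entry for the taxon ID.
function taxonomyUrl(taxonId: number): string {
    return `https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=${taxonId}`;
}
```

A common-name lookup could later be added as an extra optional property on `HostInfo` without changing the display logic much.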
Another group of fields could be on virus, lineage/clade/serotype etc.
Yet another group of fields would be INSDC which would be based on the raw data in
I'd imagine a header INSDC and then something like
There will be several quality metrics like
And things like alignment length. The LANL HIV database for example includes little previews like this
(they actually put these into the table to search and browse).
For mutations, I would follow a similar approach to authors: truncated lists that by default only span one line. Mutations could be rendered as little badges, which makes them easier to parse than plain text like C87665T. One line could be nucleotide mutations, then one line for each gene/CDS. This way users can quickly find mutations in a particular gene (most of the time, people only care about a specific gene). Alternatively, the amino acid mutations could have a dropdown in which you select the gene of interest (with a sensible default for each pathogen).
Insertions and deletions can be handled similarly, though they are typically fewer.
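To illustrate the one-line-per-gene idea, here is a small sketch. It assumes amino acid mutations arrive as strings of the form gene:mutation (for example S:N501Y) and nucleotide mutations as strings like C87665T. Both the input format and the function names are assumptions, not what the backend necessarily returns.

```typescript
interface Substitution {
    from: string;
    position: number;
    to: string;
}

// Parse a substitution such as C87665T into its parts so it can be rendered
// as a badge. Returns null if the string does not match the expected pattern.
function parseSubstitution(mutation: string): Substitution | null {
    const match = /^([A-Z*-])(\d+)([A-Z*-])$/.exec(mutation);
    if (match === null) return null;
    return { from: String(match[1]), position: Number(match[2]), to: String(match[3]) };
}

// Group amino acid mutations by gene, preserving input order within each gene.
// Strings without a gene prefix are skipped.
function groupByGene(aaMutations: string[]): Map<string, string[]> {
    const byGene = new Map<string, string[]>();
    for (const mutation of aaMutations) {
        const separator = mutation.indexOf(':');
        if (separator === -1) continue;
        const gene = mutation.slice(0, separator);
        const list = byGene.get(gene) ?? [];
        list.push(mutation.slice(separator + 1));
        byGene.set(gene, list);
    }
    return byGene;
}
```

Each map entry would become one truncated line of badges, or the keys could populate the gene dropdown if we go with that variant instead.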
This PR is quite a good template for similar improvements - it shows how to pipe through new config options from values.yaml (kubernetes) to website: https://github.com/loculus-project/loculus/pull/1442/files
I started looking into this. From a short discussion with @corneliusroemer and @bh-ethz, it would appear best to split this milestone into a couple of sub-tasks.
Update: Added the tasks to the description to use GitHub's subtask feature.
Great idea to split it up in chunks! I've added an extra list item to show the originally submitted data somehow, e.g. in a tooltip. I think this is something @emmahodcroft suggested. We always process user-submitted metadata. It can stay unchanged, but in general we might reformat it, so it's good to have the original data around to make the processing transparent.
@corneliusroemer, @theosanderson - please share your ideas! :)
Summary of suggested improvements from the comments below: