nextstrain / nextclade

Viral genome alignment, mutation calling, clade assignment, quality checks and phylogenetic placement
https://clades.nextstrain.org
MIT License

Nextclade: Add a PINNED column 1,2,3,4 to easily count sequences of a lineage after sorting for them #1085

Closed FedeGueli closed 1 year ago

FedeGueli commented 1 year ago

Hi, thanks for your great work. I would suggest, if possible, adding a PINNED column at the start of the table, before (i.e. to the left of) the ID column, with fixed numbers from 1 to X, to make it easy to count sequences after sorting (by Pango lineage, unaliased lineage, or whatever).

THX

[Screenshot 2023-01-15 at 07:46:50]

ivan-aksamentov commented 1 year ago

@FedeGueli Hi Federico,

This sounds useful.

Just to understand your use-case better, do you need to see some sort of aggregate statistics, such as:

FedeGueli commented 1 year ago

Thanks Ivan! That sounds great! But if it is too time-consuming, a fixed column with 1, 2, 3, 4 is enough, so that I can count the number of sequences belonging to a sublineage after sorting, for example by unaliased lineage: rows 1-10 BA.5.1 = 10, rows 11-16 BA.5.2 = 6, and so on.
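
To illustrate the counting described above, here is a minimal Python sketch. It is not part of Nextclade: the function name `index_ranges` and the sample data are hypothetical, chosen only to reproduce the example numbers. It sorts the lineage labels, numbers the rows from 1, and reports the row range and count for each lineage.

```python
from itertools import groupby


def index_ranges(lineages):
    """Sort lineage labels and return (lineage, first_row, last_row, count)
    tuples, with rows numbered from 1 as a fixed index column would show them."""
    ranges = []
    row = 1
    for lineage, group in groupby(sorted(lineages)):
        count = len(list(group))
        ranges.append((lineage, row, row + count - 1, count))
        row += count
    return ranges


# Hypothetical input: 6 BA.5.2 and 10 BA.5.1 sequences, in unsorted order
sample = ["BA.5.2"] * 6 + ["BA.5.1"] * 10
for lineage, first, last, count in index_ranges(sample):
    print(f"{first}-{last} {lineage} = {count}")
```

The count for a lineage is simply its last row number minus its first row number plus one, which is what the fixed index column lets you read off by eye.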

ivan-aksamentov commented 1 year ago

@FedeGueli I implemented a prototype in https://github.com/nextstrain/nextclade/pull/1092

A preview app can be tested here: https://nextclade-git-feat-web-row-index-nextstrain.vercel.app/

Is this what you wanted?

FedeGueli commented 1 year ago

Perfect!! It is exactly what I meant!

ivan-aksamentov commented 1 year ago

@FedeGueli In https://github.com/nextstrain/nextclade/pull/1096 I made a prototype of simple clade statistics. Is this something that can help your use-case?

Here is a preview app: https://nextclade-git-feat-web-clade-stats-nextstrain.vercel.app/ The layout is not perfect yet and there might be bugs.

The team is hesitant to add complex post-processing features like this, so I can't guarantee it will appear in the main app. But if there is a strong need, it will be easier to justify.

P.S. By "post-processing" I mean that the same results can be easily generated by downloading the TSV output file, opening it in Excel and plotting the charts you need or processing it in any way you like in Excel or with Python & pandas etc.
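
As a rough sketch of that kind of post-processing, the snippet below counts sequences per clade from a tab-separated file using only the Python standard library (pandas would work just as well). The column names `seqName` and `clade`, the helper name `count_by_column`, and the in-memory sample rows are assumptions for illustration; check them against the header of your own TSV file.

```python
import csv
import io
from collections import Counter


def count_by_column(tsv_file, column="clade"):
    """Count how many rows of a tab-separated file share each value of `column`."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return Counter(row[column] for row in reader)


# Made-up stand-in for a downloaded TSV file (assumed column names)
example_tsv = io.StringIO(
    "seqName\tclade\n"
    "seq1\t22B\n"
    "seq2\t22B\n"
    "seq3\t21L\n"
)

counts = count_by_column(example_tsv)
for clade, n in counts.most_common():
    print(clade, n)
```

For a real file, pass an open file handle instead, e.g. `with open("nextclade.tsv", newline="") as f: counts = count_by_column(f)`, where the file name is whatever you saved the download as. Changing the `column` argument counts by any other column present in the file.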

FedeGueli commented 1 year ago

Hi Ivan! Honestly, I found your #1096 very good for visualizing results immediately, and for the purpose of scanning for variants it can help a lot when going through big batches of sequences. If it is handy, why not, I would say? But obviously I understand that it duplicates things that are easy to do with the TSV. Honestly, I don't think I am the right person to talk about this kind of thing: I was unaware of the existence and use of the buttons there. Really cool to know about them now.