MDU-PHL / pango-collapse

app to collapse Pango lineages for reporting
https://mdu-phl.github.io/pango-collapse/
GNU General Public License v3.0
10 stars, 1 fork

What does collapse.txt mean here? #4

Closed liamxg closed 1 year ago

Wytamma commented 1 year ago

collapse.txt is a file listing the parent lineages you want to collapse up to, one per line.

You could create one like so:

echo "B.1.1.529" > collapse.txt

Then use it with pango-collapse:

pango-collapse input.csv --collapse-file collapse.txt -o output.csv

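If you would rather script these two steps, here is a minimal Python sketch of the same thing. The helper names `collapse_file_text` and `build_command` are made up for this example; the only pango-collapse options used are the ones in the command above.

```python
def collapse_file_text(lineages):
    """Return collapse-file content: one lineage per line, like the echo example."""
    return "".join(f"{lineage}\n" for lineage in lineages)


def build_command(input_csv, collapse_file, output_csv):
    """Build the argument list for the pango-collapse call shown above."""
    return [
        "pango-collapse",
        input_csv,
        "--collapse-file",
        collapse_file,
        "-o",
        output_csv,
    ]


text = collapse_file_text(["B.1.1.529"])
cmd = build_command("input.csv", "collapse.txt", "output.csv")

# To actually run it (requires pango-collapse to be installed):
#   from pathlib import Path
#   import subprocess
#   Path("collapse.txt").write_text(text)
#   subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell quoting problems if your file names contain spaces.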
Wytamma commented 1 year ago

That looks like an issue with downloading numpy. Can you provide the complete traceback?

Wytamma commented 1 year ago

It looks like you have a connection error. Maybe your pip is out of date? You could try upgrading it with python -m pip install --upgrade pip

Wytamma commented 1 year ago

First, save the lineages you want to collapse as a CSV file with a Lineage header (there is only one column, so no commas are required).

$ cat Lineages.csv
Lineage
BA.2.75.4
BA.5.2.6
BF.7
...
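If your lineages are already in a Python list, a file in this layout could be produced with a sketch like the one below. The function name `lineages_csv_text` is hypothetical; the only thing taken from above is the single `Lineage` header.

```python
import csv
import io


def lineages_csv_text(lineages):
    """Return a one-column CSV with a 'Lineage' header, one lineage per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Lineage"])
    for lineage in lineages:
        writer.writerow([lineage])
    return buffer.getvalue()


text = lineages_csv_text(["BA.2.75.4", "BA.5.2.6", "BF.7"])

# To save it next to your other files:
#   from pathlib import Path
#   Path("Lineages.csv").write_text(text)
```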

Then create a collapse file with your variants:

$ cat collapse.txt
# alpha
B.1.1.7
# beta
B.1.351
# delta
B.1.617.2
# epsilon
B.1.427
B.1.429
# iota
B.1.526
# omicron
B.1.1.529
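To make the layout of this file concrete, here is a rough sketch of how it could be read back in Python, assuming that lines starting with `#` are comments. Using each comment as a label for the lineages under it is only this sketch's convention (it is handy for naming variants later), and `parse_collapse_file` is a made-up helper, not part of pango-collapse.

```python
def parse_collapse_file(text):
    """Map each lineage to the most recent '# label' comment above it."""
    labels = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("#"):
            current = line.lstrip("#").strip()  # remember the label
            continue
        labels[line] = current
    return labels


example = "# alpha\nB.1.1.7\n# epsilon\nB.1.427\nB.1.429\n"
labels = parse_collapse_file(example)
```

The keys of `labels` are the lineages to collapse up to, and the values are the names taken from the comments.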

Then run pango-collapse in --strict mode on the files:

$ pango-collapse -c collapse.txt -o collapsed.csv --strict Lineages.csv
pango-collapse 0.2.1

Collapsing up to the following lineages:
 - B.1.1.7
 - B.1.351
 - B.1.617.2
 - B.1.427
 - B.1.429
 - B.1.526
 - B.1.1.529
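Conceptually, collapsing means replacing each lineage with the closest parent from the collapse file. The sketch below illustrates that idea only and is not pango-collapse's actual implementation. It assumes that in --strict mode a lineage without a parent in the collapse file gets no value, and it hard-codes four aliases from the Pango alias key (`ALIASES` and both function names are made up for this example).

```python
# Tiny hand-copied subset of the Pango alias key (assumption: enough for this demo)
ALIASES = {
    "BA": "B.1.1.529",
    "BF": "B.1.1.529.5.2.1",
    "AY": "B.1.617.2",
    "Q": "B.1.1.7",
}


def expand(lineage):
    """Replace a known alias prefix with its full lineage name."""
    prefix, _, rest = lineage.partition(".")
    base = ALIASES.get(prefix)
    if base is None:
        return lineage
    return f"{base}.{rest}" if rest else base


def collapse_strict(lineage, targets):
    """Return the most specific target that is the lineage or one of its parents."""
    full = expand(lineage)
    matches = [t for t in targets if full == t or full.startswith(t + ".")]
    return max(matches, key=len) if matches else None


TARGETS = ["B.1.1.7", "B.1.351", "B.1.617.2", "B.1.427", "B.1.429", "B.1.526", "B.1.1.529"]
families = {name: collapse_strict(name, TARGETS) for name in ["BA.2.75.4", "BF.7", "XC"]}
```

Matching on `t + "."` rather than on the bare prefix keeps a lineage such as B.1.1.70 from being treated as a child of B.1.1.7.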

You can plot the results:

import pandas as pd

# Map each collapsed parent lineage to its variant name
VOCs = {
  "B.1.1.7": "alpha",
  "B.1.351": "beta",
  "B.1.617.2": "delta",
  "B.1.429": "epsilon",
  "B.1.427": "epsilon",
  "B.1.526": "iota",
  "B.1.1.529": "omicron",
}
df = pd.read_csv("collapsed.csv")
# Assign back to the column: inplace=True on df.Lineage_family may act on a copy in recent pandas
df["Lineage_family"] = df["Lineage_family"].replace(VOCs).fillna("Other")
df["Lineage_family"].value_counts().plot(kind="bar")

[Figure: bar chart of the Lineage_family counts]

Or do some exploration:

df.groupby('Lineage_family')['Lineage'].describe()

Lineage_family  count  unique  top        freq
Other            1314    1314  XC            1
alpha               9       9  Q.7           1
beta                4       4  B.1.351.5     1
delta             245     245  AY.29.2       1
epsilon             3       3  B.1.429.1     1
iota                1       1  B.1.526       1
omicron           295     295  BA.2.75.4     1

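If you don't have pandas installed, the same kind of tally can be sketched with the standard library. This assumes the output CSV has the Lineage and Lineage_family columns used above and that lineages without a family are left empty. The `SAMPLE` rows are hand-written for illustration and are not real pango-collapse output.

```python
import csv
import io
from collections import Counter

VOCS = {
    "B.1.1.7": "alpha",
    "B.1.351": "beta",
    "B.1.617.2": "delta",
    "B.1.429": "epsilon",
    "B.1.427": "epsilon",
    "B.1.526": "iota",
    "B.1.1.529": "omicron",
}


def count_families(csv_text, names=VOCS):
    """Count rows per variant name; rows with an empty family count as 'Other'."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        family = (row.get("Lineage_family") or "").strip()
        counts[names.get(family, family) if family else "Other"] += 1
    return counts


# Illustrative rows only
SAMPLE = (
    "Lineage,Lineage_family\n"
    "BA.2.75.4,B.1.1.529\n"
    "AY.29.2,B.1.617.2\n"
    "XC,\n"
    "BA.5.2.6,B.1.1.529\n"
)
counts = count_families(SAMPLE)
```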
liamxg commented 1 year ago

Could you please add gamma and send me the output file? Thanks.

Wytamma commented 1 year ago

Here it is -> liamxg.csv.
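The collapse file used to produce this output is not shown in the thread. One way gamma could be added yourself is to append its Pango lineage, P.1, to collapse.txt, and add "P.1": "gamma" to the VOCs dictionary when plotting. A minimal sketch, with `add_group` as a made-up helper:

```python
def add_group(collapse_text, label, lineages):
    """Append a '# label' comment and its lineages to collapse-file text."""
    lines = collapse_text.rstrip("\n").splitlines()
    lines.append(f"# {label}")
    lines.extend(lineages)
    return "\n".join(lines) + "\n"


updated = add_group("# omicron\nB.1.1.529\n", "gamma", ["P.1"])
```

After rewriting collapse.txt with the updated text, you would rerun pango-collapse with the same options as before.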


liamxg commented 1 year ago

@Wytamma thanks. I will thank you in my paper. I really appreciate this.

Wytamma commented 1 year ago

@liamxg You're very welcome! I will close this issue now, but please reopen if you have any other issues.

liamxg commented 1 year ago

okay, thanks.