marschall-lab / panacus

Panacus is a tool for computing statistics for GFA-formatted pangenome graphs
MIT License

Merge different chroms stats into one graph #7

Closed baozg closed 1 year ago

baozg commented 1 year ago

Hi,

How do we merge stats from all chromosomes into a single plot?

Best regards, Zhigui

danydoerr commented 1 year ago

This is possible from coverage histogram data, i.e., the output of panacus hist ..., but it requires you to merge the per-chromosome data manually. This is simple because the outputs are just tables. Here is one way to do it with Python/pandas:

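import pandas as pd

# Per-chromosome outputs of `panacus hist`
FILES = [
    '/your/list/of/chromosomes/chr1.hist.txt',
    '/your/list/of/chromosomes/chr2.hist.txt',
]
df = None

for f in FILES:
    # header=[1]: column names come from the second line; index_col=[0]: first column is the row index
    _df = pd.read_csv(f, sep='\t', header=[1], index_col=[0])
    if df is None:
        df = _df
    else:
        # element-wise sum; assumes all tables share the same rows and columns
        df += _df

with open('/your/output/file.hist.txt', 'w') as out:
    df.reset_index().to_csv(out, sep='\t', index=False)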

Then pass the output file to panacus, i.e., run panacus growth /your/output/file.hist.txt, and you're good to go.
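For anyone who prefers not to depend on pandas, the same merge can be sketched with the standard library only. This is a minimal sketch, not part of panacus. The function name `merge_hist_tables` is hypothetical. The layout is inferred from the `read_csv` call above: one leading line that is skipped, then a header line, then tab-separated rows whose first column is the row key. It also assumes all counts are integers, and the column names in the demo are placeholders.

```python
import csv
import io


def merge_hist_tables(streams):
    """Sum tab-separated count tables element-wise, keyed by the first column.

    Assumed layout (inferred, not confirmed by panacus documentation):
    line 0 is skipped, line 1 holds the column names, and the remaining
    lines hold integer counts with the row key in the first column.
    """
    header = None
    totals = {}  # row key -> list of summed counts, in first-seen order
    for stream in streams:
        rows = list(csv.reader(stream, delimiter="\t"))
        cur_header, body = rows[1], rows[2:]
        if header is None:
            header = cur_header
        elif cur_header != header:
            raise ValueError("column headers differ between input tables")
        for row in body:
            if not row:
                continue  # ignore blank lines
            key, values = row[0], [int(v) for v in row[1:]]
            if key in totals:
                totals[key] = [a + b for a, b in zip(totals[key], values)]
            else:
                totals[key] = values
    return header, totals


# Tiny in-memory demo with made-up numbers standing in for two chromosomes.
chr1 = io.StringIO("# first line, skipped\nhist\tbp\n0\t5\n1\t10\n")
chr2 = io.StringIO("# first line, skipped\nhist\tbp\n0\t1\n1\t2\n2\t3\n")
header, totals = merge_hist_tables([chr1, chr2])
print("\t".join(header))
for key, values in totals.items():
    print("\t".join([key] + [str(v) for v in values]))
```

One difference from the pandas snippet: a row that is missing from one table is treated here as contributing zero, whereas `df += _df` would produce NaN for it. Whether that is the right behavior depends on whether all chromosomes were processed with the same set of paths.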

baozg commented 1 year ago

After running panacus hist and panacus growth, the final visualization shows #nodes instead of bps, even though I used -c bp for hist.

danydoerr commented 1 year ago

Right, this is a known bug that I need to fix at some point. For the record, it’s just the label that is wrong (good that you put this in the ticket here).
