milkschen / chaintools

Utilities for the genomic chain format
MIT License

to_paf.py is very slow #10

Closed gsc74 closed 1 year ago

gsc74 commented 1 year ago

@milkschen and @nhansen, I'm using the following command to generate liftover-split.paf:

python3 chaintools/src/chaintools/to_paf.py -c liftover-split.chain -t ../CHM13Y_L.fa -q ../GCA_000001405.15_GRCh38_no_alt_analysis_set.fasta -o liftover-split.paf

Generating the PAF file with to_paf.py seems to be very slow. Is there an alternative way to speed up the computation? And what is the estimated run time for whole-genome human references?

gsc74 commented 1 year ago

I had interchanged the target and query sequences (the FASTA files passed to -t and -q); that was the issue.
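
For anyone who hits the same problem, one way to catch a swapped -t/-q before a long run is to compare the contig names and sizes in the chain headers against the two FASTA files. The sketch below is not part of chaintools. The helper names (`chain_contigs`, `fasta_lengths`, `mismatches`) are hypothetical, and it assumes the standard UCSC chain header layout (`chain score tName tSize tStrand tStart tEnd qName qSize qStrand qStart qEnd id`).

```python
def chain_contigs(chain_lines):
    """Collect {name: size} for the target and query sides of the chain headers."""
    target, query = {}, {}
    for line in chain_lines:
        fields = line.split()
        # Header lines start with "chain"; alignment data lines are skipped.
        if len(fields) >= 12 and fields[0] == "chain":
            target[fields[2]] = int(fields[3])  # tName, tSize
            query[fields[7]] = int(fields[8])   # qName, qSize
    return target, query


def fasta_lengths(fasta_lines):
    """Return {sequence name: length} from an iterable of FASTA lines."""
    lengths = {}
    name = None
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""  # keep only the ID before whitespace
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def mismatches(chain_side, fasta):
    """Names on one side of the chain whose size is missing or differs in the FASTA."""
    return sorted(n for n, size in chain_side.items() if fasta.get(n) != size)


if __name__ == "__main__":
    # Tiny in-memory example; real input would be open file handles.
    chain = ["chain 100 chrA 10 + 0 10 chrB 10 + 0 10 1", "10", ""]
    target_fa = [">chrA", "ACGTACGTAC"]
    query_fa = [">chrB", "ACGTACGTAC"]

    t, q = chain_contigs(chain)
    print("target side vs -t file:", mismatches(t, fasta_lengths(target_fa)))
    print("target side vs -q file:", mismatches(t, fasta_lengths(query_fa)))
```

If the target side of the chain matches the file you intended for -q better than the one for -t, the two arguments are probably swapped. For whole-genome FASTA files, reading lengths from an existing .fai index would likely be faster than scanning the sequences as done here.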

milkschen commented 1 year ago

Hi @gsc74, thanks for sharing your experience!