dputhier / libgtftk

gtftk C Library and program
GNU General Public License v3.0

Processing time in add_attr #58

Closed dputhier closed 6 years ago

dputhier commented 6 years ago

This may help with issue #57. I found another problem that seems to occur only with the file Global_GTF_category_moreThan200nc.gtf.

Indeed, it does not happen with another large file (e.g., all human transcripts from Ensembl release 90). When I convert this file (Global_GTF_category_moreThan200nc.gtf) to Ensembl format and try to join the attached file, it takes a long time (around 10 minutes?). In this case, the time is spent in the add_attribute C function.

    gtftk join_attr -i Global_GTF_category_moreThan200nc_ens.gtf -j to_join.txt -k transcript_id -n tx_geno_size -t transcript -V 2

to_join.txt
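For cross-checking the output on a small sample, the same kind of key-based join can be sketched with plain awk. This is only an illustration, not how gtftk implements join_attr. The miniature GTF lines are made up, and the two-column tab-separated layout assumed for to_join.txt (transcript_id, then value) is an assumption, since the attachment's format is not shown here.

```shell
# Hypothetical miniature inputs; the real files from this report are not reproduced.
tmp=$(mktemp -d)
printf 'chr1\tsrc\ttranscript\t1\t100\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n' > "$tmp/mini.gtf"
printf 'chr1\tsrc\texon\t1\t100\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n' >> "$tmp/mini.gtf"
printf 'chr1\tsrc\ttranscript\t200\t300\t.\t+\t.\tgene_id "G2"; transcript_id "T2";\n' >> "$tmp/mini.gtf"
# Assumed join-file layout: key <TAB> value.
printf 'T1\t100\n' > "$tmp/to_join.txt"

# First file: load key -> value into an array.
# Second file: append tx_geno_size to "transcript" lines whose transcript_id is a known key.
joined=$(awk -F'\t' -v OFS='\t' '
  NR == FNR { val[$1] = $2; next }
  $3 == "transcript" && match($9, /transcript_id "[^"]+"/) {
    # Skip the 15 characters of: transcript_id "  and drop the closing quote.
    id = substr($9, RSTART + 15, RLENGTH - 16)
    if (id in val) $9 = $9 " tx_geno_size \"" val[id] "\";"
  }
  { print }
' "$tmp/to_join.txt" "$tmp/mini.gtf")

printf '%s\n' "$joined"
rm -rf "$tmp"
```

Because the lookup is a single array access per line, this runs in one pass over the GTF, which gives a rough lower bound to compare against when timing the gtftk command above with `time`.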

dputhier commented 6 years ago

This one seems to be fixed in 2714b90bfb90a15c74c6bf0012b4097e0ac99582