Open johnomics opened 5 years ago
Please try the latest paftools. It should have resolved the issue.
Thanks for the quick response. This works for the GTF, so I can continue with that, but just to let you know, it doesn't work with the GFF (maybe a separate issue?):
$ paftools.js gff2bed -j GCA_000001405.15_GRCh38_full_analysis_set.refseq_annotation.gff
chr1 12227 12612 NR_046018.2|misc_RNA|N/A 1000 +
chr1 12721 13220 NR_046018.2|misc_RNA|N/A 1000 +
chr1 14829 14969 NR_024540.1|misc_RNA|N/A 1000 -
chr1 15038 15795 NR_024540.1|misc_RNA|N/A 1000 -
chr1 15947 16606 NR_024540.1|misc_RNA|N/A 1000 -
chr1 16765 16857 NR_024540.1|misc_RNA|N/A 1000 -
chr1 17055 17232 NR_024540.1|misc_RNA|N/A 1000 -
chr1 17368 17605 NR_024540.1|misc_RNA|N/A 1000 -
chr1 17742 17914 NR_024540.1|misc_RNA|N/A 1000 -
chr1 18061 18267 NR_024540.1|misc_RNA|N/A 1000 -
chr1 18366 24737 NR_024540.1|misc_RNA|N/A 1000 -
chr1 24891 29320 NR_024540.1|misc_RNA|N/A 1000 -
/mnt/lustre/groups/biol-tf-2018/software/miniconda3/bin/paftools.js:1578: Error: No transcript_id
if (id == null) throw Error("No transcript_id");
^
Error: No transcript_id
at Error (<anonymous>)
at paf_gff2bed (/mnt/lustre/groups/biol-tf-2018/software/miniconda3/bin/paftools.js:1578:25)
at main (/mnt/lustre/groups/biol-tf-2018/software/miniconda3/bin/paftools.js:2518:29)
at /mnt/lustre/groups/biol-tf-2018/software/miniconda3/bin/paftools.js:2535:1
Then use GTF. I think NCBI GFF3 is somewhat problematic and is inconsistent with the corresponding GTF. GENCODE/Ensembl GTF and GFF3 contain pretty much the same information.
I am reopening this issue in case I come back to it and make further improvements for NCBI GFF3.
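For anyone hitting the "No transcript_id" error on a GFF3 in the meantime, here is a minimal sketch of one possible workaround, not the actual paftools.js code. It assumes that GFF3 records which lack a transcript_id attribute still carry a Parent attribute that can stand in as a transcript identifier. The helper names parseGff3Attributes and transcriptIdFor are hypothetical, and the attribute strings in the usage lines are illustrative only.

```javascript
// Parse a GFF3 attribute column ("key=value;key=value") into a plain object.
// Values are kept raw; percent-decoding is deliberately left out of this sketch.
function parseGff3Attributes(column) {
  const attrs = {};
  for (const field of column.split(";")) {
    const eq = field.indexOf("=");
    if (eq <= 0) continue; // skip empty or malformed fields
    attrs[field.slice(0, eq).trim()] = field.slice(eq + 1).trim();
  }
  return attrs;
}

// Hypothetical helper: pick a transcript identifier for a record.
// Assumption: when transcript_id is absent, the first Parent value is
// an acceptable substitute. Returns null if neither attribute exists.
function transcriptIdFor(attrs) {
  if (attrs.transcript_id) return attrs.transcript_id;
  if (attrs.Parent) return attrs.Parent.split(",")[0]; // Parent may list several IDs
  return null;
}

// Illustrative attribute strings (made up for this example).
const withId = parseGff3Attributes("ID=exon-1;Parent=rna-T1;transcript_id=T1");
const withoutId = parseGff3Attributes("ID=exon-2;Parent=id-SEG1;gene=G1");
console.log(transcriptIdFor(withId));    // T1
console.log(transcriptIdFor(withoutId)); // id-SEG1
```

A caller could skip records for which this returns null instead of throwing, at the cost of silently dropping those features.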
Please try the latest paftools. It should have resolved the issue.
I found that the human and mouse GTFs from Ensembl all have gene_id and gene_name, but some genes of other species (GFF from Ensembl) have a gene_id attribute but no gene_name attribute. How did you fix this problem? Do you just ignore the genes that have a gene_id attribute but no gene_name attribute in the bam file, or do you use gene_id or something else instead of gene_name?
I am still getting the original "...ReferenceError: name is not defined..." as above with minimap2 2.17-r941 (latest version of paftools.js I assume). I'm trying to use the --junc-bed option and only have the gtf.
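As background on what a junction BED contains, the following is a rough sketch of how intron (junction) intervals can be derived from the exons of one transcript. It is not taken from paftools.js. It assumes exon coordinates are 1-based and inclusive, as in GTF/GFF, and that the output is 0-based and half-open, as in BED. The function name junctionsFromExons is hypothetical, and the exon coordinates in the usage lines are assumed for illustration, chosen so that the result lines up with the first two NR_046018.2 rows of the gff2bed output posted in this thread.

```javascript
// Derive junction (intron) intervals from the exons of a single transcript.
// Input: exons with 1-based inclusive start/end (GTF/GFF convention).
// Output: 0-based half-open intervals (BED convention).
function junctionsFromExons(exons) {
  // Sort a copy by start so the caller's array is left untouched.
  const sorted = exons.slice().sort((a, b) => a.start - b.start);
  const junctions = [];
  for (let i = 1; i < sorted.length; ++i) {
    const start = sorted[i - 1].end; // 1-based exon end equals 0-based intron start
    const end = sorted[i].start - 1; // last intron base (1-based) equals exclusive BED end
    if (end > start) junctions.push({ start: start, end: end }); // ignore touching or overlapping exons
  }
  return junctions;
}

// Illustrative exon coordinates (assumed for this example).
const exons = [
  { start: 11874, end: 12227 },
  { start: 12613, end: 12721 },
  { start: 13221, end: 14409 },
];
for (const j of junctionsFromExons(exons)) {
  console.log(["chr1", j.start, j.end].join("\t"));
}
```

A transcript with a single exon yields no junctions, so it contributes nothing to a junction BED under this scheme.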
Thank you for all your excellent work on minimap2, we use it every day.
I'm trying to convert the NCBI GRCh38 RefSeq annotation to BED format for aligning with minimap2 using paftools.js gff2bed. As per your advice, I'm using the no_alt_analysis GRCh38, and have got the full_analysis_set GFF and GTF from the same folder:
I get the following error when running gff2bed, with the GTF or GFF (minimap2 v2.17 release):
The name variable used at line 1593 is set in the if statements at lines 1567 and 1574, but it is not initialised; instead, a gname variable is initialised at line 1562 but does not appear to be used.
If I change the name variable to gname, the command works, but I only ever get N/A for gene names; the NCBI annotations have gene_id and gene, but not gene_name. However, changing gene_name to gene_id or gene, or adding additional else if statements to check for gene_id or gene, doesn't work either.
Please could you look into this? Should I be using a different annotation? Or is there a fix that will include the NCBI gene names? Many thanks.
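To make the request concrete, here is a minimal sketch of the kind of fallback described above, written independently of the paftools.js source. It assumes GTF-style attributes of the form key "value"; and tries gene_name first, then gene, then gene_id, before falling back to the N/A placeholder seen in the output. The helper names parseGtfAttributes and geneLabel and the fallback order are assumptions made for illustration, and the attribute strings in the usage lines are made up.

```javascript
// Parse a GTF attribute column (key "value"; key "value";) into an object.
// If a key occurs more than once, the last occurrence wins in this sketch.
function parseGtfAttributes(column) {
  const attrs = {};
  const re = /(\S+)\s+"([^"]*)"\s*;?/g;
  let m;
  while ((m = re.exec(column)) !== null) {
    attrs[m[1]] = m[2];
  }
  return attrs;
}

// Hypothetical helper: choose a gene label with fallbacks.
// The variable is declared and assigned in one place, so it can never be
// referenced before it is defined.
function geneLabel(attrs) {
  const label = attrs.gene_name || attrs.gene || attrs.gene_id;
  return label || "N/A";
}

// Illustrative attribute strings (made up for this example).
const ncbiLike = parseGtfAttributes('gene_id "G1"; transcript_id "T1"; gene "GENE1";');
const ensemblLike = parseGtfAttributes('gene_id "ENSG1"; gene_name "ABC";');
console.log(geneLabel(ncbiLike));    // GENE1
console.log(geneLabel(ensemblLike)); // ABC
```

With this ordering, records that only carry gene_id would be labelled with that ID instead of N/A.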