Closed by Moo-cow 8 months ago
Thank you for your report, @Moo-cow!
Would you like to try to reproduce this with the latest version of PGAP? We had a release in October.
Hi~ I updated the software and database to 2023-10-03.build7061. The problem remains. It looks like a problem with small contigs.
We do have a check for those. Small contigs can either be removed, or you can push through using ./pgap.py --ignore-all-errors
Hi~ I tried adding --ignore-all-errors, but it doesn't work. The gene still exceeds the length of the contig, so these errors are still present.
Could you please post cwltool.log?
If it mentions a contig name, could you please post the FASTA header for that contig?
Please post your submol.yaml file as well if you ran PGAP with a YAML file as a parameter.
Thanks
The contig:
NODE_67_length_271_cov_0.752577
CTCGGCGGTGACCTGATAGACCATGCCGACCAGCAGGATGGCCAGGGTGAAACAGCCGAT
CAGCAGCAGATAGAACTGGATAAACAGGCGGCGCATTGGGCTCTCCTCTTGCCTGGATTA
CGTCTGGGCATTGCCTTGCTGCCCCCTCATCCCCAACCCTTCTCCCGCAAGGGGAGAAGG
GAGCAGAGAGCATCAAGGTCATTCGGCGGGGCTTATTCCCAGGCCTGGGGTACTAACAGG
TAGCCCTTCTGGCGCACTGTCTTGATGCGGG
The anno:
NODE_67_length_271_cov_0.752577 Local region 1 271 . + . ID=NODE_67_length_271_cov_0.752577:1..271;Dbxref=taxon:244366;Is_circular=true;Name=ANONYMOUS;gbkey=Src;genome=chromosome;mol_type=genomic DNA;strain=None
NODE_67_length_271_cov_0.752577 . gene 233 367 . - . ID=gene-LWNFPK02_4_005574;Name=LWNFPK02_4_005574;gbkey=Gene;gene_biotype=protein_coding;locus_tag=LWNFPK02_4_005574
NODE_67_length_271_cov_0.752577 GeneMarkS-2+ CDS 233 367 . - 0 ID=cds-LWNFPK02_4_005574;Parent=gene-LWNFPK02_4_005574;Name=extdb:LWNFPK02_4_005574;gbkey=CDS;inference=COORDINATES: ab initio prediction:GeneMarkS-2+;locus_tag=LWNFPK02_4_005574;product=hypothetical protein;protein_id=extdb:LWNFPK02_4_005574;transl_table=11
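One way to find every record of this kind in an annotation is to compare each feature's end coordinate with the end of the `region` record for the same contig. The sketch below is not part of PGAP. The function name `features_past_contig_end` is hypothetical, the example uses abbreviated contig names and attributes, and it assumes the nine GFF3 columns are separated by tabs or, as in the pasted lines above, by whitespace.

```python
def features_past_contig_end(gff_lines):
    """Return (seqid, type, start, end, contig_length) for features that end
    beyond the end coordinate of their contig's 'region' record."""
    lengths = {}   # seqid -> end coordinate of its region record
    features = []  # (seqid, type, start, end) for every non-region record
    for line in gff_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Prefer tabs (standard GFF3); fall back to whitespace for pasted text.
        fields = line.split("\t") if "\t" in line else line.split(None, 8)
        if len(fields) < 5:
            continue
        seqid, ftype = fields[0], fields[2]
        start, end = int(fields[3]), int(fields[4])
        if ftype == "region":
            lengths[seqid] = end
        else:
            features.append((seqid, ftype, start, end))
    return [
        (seqid, ftype, start, end, lengths[seqid])
        for seqid, ftype, start, end in features
        if seqid in lengths and end > lengths[seqid]
    ]


# Abbreviated version of the records posted above (names shortened for brevity).
example = [
    "NODE_67 Local region 1 271 . + . ID=NODE_67:1..271;Is_circular=true",
    "NODE_67 . gene 233 367 . - . ID=gene-1;gbkey=Gene",
    "NODE_67 GeneMarkS-2+ CDS 233 367 . - 0 ID=cds-1;Parent=gene-1",
]
hits = features_past_contig_end(example)
for hit in hits:
    print(hit)
```

Both the gene and its CDS are reported here, because each ends at 367 on a 271-base contig.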
submol.yaml file:
topology: 'circular'
organism:
    genus_species: 'Klebsiella variicola'
    strain: 'None'
locus_tag_prefix: 'LWNFPK024'
cwltool.log in the attachment.
Thanks for your time.
Large attachment sent via NetEase 163 Mail: cwltool.log (54.87 MB, download link expires 2024-01-04 14:00)
I have the same issue too!
cwltool.log has not been attached.
Also please make sure that your small contigs are not marked circular by accident.
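For context on why the circular flag matters: the GFF3 specification represents a feature that crosses the origin of a circular sequence by adding the sequence length to its end coordinate. An end beyond the contig length is therefore what a wrap-around gene looks like on a record that carries Is_circular=true. The sketch below illustrates that arithmetic only. It is not PGAP's own code, and the function name `split_circular_interval` is hypothetical.

```python
def split_circular_interval(start, end, length):
    """Split a 1-based, inclusive interval on a circular sequence of the given
    length into the linear pieces it covers. An end greater than the length
    means the feature wraps around the origin (GFF3 convention)."""
    if not 1 <= start <= length:
        raise ValueError("start must lie within the sequence")
    if end < start or end - start + 1 > length:
        raise ValueError("interval is empty or longer than the sequence")
    if end <= length:
        return [(start, end)]                   # no wrap-around
    return [(start, length), (1, end - length)]  # tail piece, then head piece


# Coordinates from the two annotations quoted in this thread.
node_67 = split_circular_interval(233, 367, 271)
node_952 = split_circular_interval(71, 319, 259)
print(node_67)
print(node_952)
```

Read this way, the 233..367 gene on the 271-base contig covers bases 233..271 and then 1..96. If the contig is not really circular, that gene is an artifact of the topology setting.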
Can you please give the cut-off or threshold used to classify contig size, i.e. below which length is a contig considered small?
200 bases
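If removing small contigs before annotation is the chosen route, a filter along the following lines could be used. This is a minimal sketch rather than a PGAP feature. The helper names `read_fasta` and `drop_short_contigs` are hypothetical, and keeping contigs of exactly `min_length` bases is an assumption, since the thread does not say whether the 200-base cut-off is inclusive.

```python
def read_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_short_contigs(text, min_length=200):
    """Keep only contigs with at least min_length bases (assumed inclusive)."""
    return [(h, s) for h, s in read_fasta(text) if len(s) >= min_length]


# Toy input: one 4-base contig and one 250-base contig.
toy = ">short\nACGT\n>long\n" + "A" * 125 + "\n" + "C" * 125 + "\n"
kept = drop_short_contigs(toy)
print([(header, len(seq)) for header, seq in kept])
```

Note that the contigs quoted in this thread (271 and 259 bases) are longer than 200 bases, so they would pass this filter at the default threshold. Dropping them would require a larger `min_length`.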
Describe the bug
The length of the gene exceeds the length of the contig.
NODE_952_length_259_cov_0.802198 Local region 1 259 . + . ID=NODE_952_length_259_cov_0.802198:1..259;Dbxref=taxon:562;Is_circular=true;Name=ANONYMOUS;gbkey=Src;genome=chromosome;mol_type=genomic DNA;strain=IW
NODE_952_length_259_cov_0.802198 . gene 71 319 . - . ID=gene-pgaptmp_004397;Name=pgaptmp_004397;gbkey=Gene;gene_biotype=protein_coding;locus_tag=pgaptmp_004397
NODE_952_length_259_cov_0.802198 GeneMarkS-2+ CDS 71 319 . - 0 ID=cds-pgaptmp_004397;Parent=gene-pgaptmp_004397;Name=extdb:pgaptmp_004397;gbkey=CDS;inference=COORDINATES: ab initio prediction:GeneMarkS-2+;locus_tag=pgaptmp_004397;product=hypothetical protein;protein_id=extdb:pgaptmp_004397;transl_table=11
To Reproduce
Is the genome too fragmented to annotate with PGAP?
Software versions (please complete the following information):