Closed by inesliroulet 4 months ago
Hi @inesliroulet,
Thanks for your feedback! To answer briefly, please just remove the protein argument (-P proteins.fa), and it should work correctly.
Here is a more detailed explanation: When you run LiftOn, it extracts proteins from the annotation file and stores them in a FASTA file. If you have already run LiftOn and want to run it again, you can pass that file using the -P protein argument to save some time. The reason you encountered the error is that the protein IDs listed in the file you provided do not match the IDs stored in the GFF file. NCBI slightly changes the protein IDs.
This is my assumption based on some common mistakes that might happen. If it does not solve your problem, feel free to send me your data or files: kuanhao.chao@gmail.com.
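For anyone who wants to test the ID-mismatch hypothesis before re-running, here is a minimal Python sketch. It assumes the GFF is an NCBI-style GFF3 whose CDS rows carry a protein_id attribute, and it compares those values with the first token of each FASTA header. The function names and the example IDs are hypothetical, and this is not how LiftOn itself matches IDs; it is only a quick sanity check.

```python
import io


def fasta_ids(handle):
    """Collect the first whitespace-delimited token of every FASTA header."""
    ids = set()
    for line in handle:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.add(tokens[0])
    return ids


def gff_attribute_values(handle, key="protein_id"):
    """Collect every value of `key` found in column 9 of a GFF3 file.

    Assumption: attributes are written as key=value pairs separated by ';',
    as in GFF3 files distributed by NCBI.
    """
    values = set()
    for line in handle:
        if line.startswith("#"):
            continue  # skip directives and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # skip malformed or blank lines
        for field in cols[8].split(";"):
            name, sep, value = field.partition("=")
            if sep and name.strip() == key:
                values.add(value.strip())
    return values


def compare_ids(fasta_handle, gff_handle, key="protein_id"):
    """Report IDs that appear in only one of the two files."""
    prot = fasta_ids(fasta_handle)
    gff = gff_attribute_values(gff_handle, key)
    return {
        "only_in_fasta": sorted(prot - gff),
        "only_in_gff": sorted(gff - prot),
    }


if __name__ == "__main__":
    # Tiny in-memory demo with made-up IDs; replace with open("proteins.fa")
    # and open("annotation.gff") to check real files.
    demo_fasta = io.StringIO(">PROT_A.1 example\nMKV\n>PROT_B.1\nMLL\n")
    demo_gff = io.StringIO(
        "##gff-version 3\n"
        "chr1\tsrc\tCDS\t1\t9\t.\t+\t0\tID=cds-PROT_A.1;protein_id=PROT_A.1\n"
    )
    print(compare_ids(demo_fasta, demo_gff))
```

If only_in_fasta is non-empty, the protein file contains IDs that the annotation does not mention, which is the situation described above.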
Hello again @Kuanhao-Chao,
Thank you for your answer. Sadly, I have tried launching the tool again without the protein argument (lifton -g annotation.gff -t 30 centaurea_genome.fasta genome.fa), and I get the exact same error again, so it seems like it's not the protein IDs.
Here are the links to the NCBI data I use:
I will also send them to you via email along with the genome I am trying to annotate.
Thank you very much for your help
Hi @inesliroulet,
Thanks for sharing the data with me. I ran LiftOn on your dataset and did not encounter the error. Here is the command that I ran:
lifton -g /data/reference/annotation.gff -o lifton.gff3 -polish -copies -sc 0.95 /data/target/centaurea_genome.fasta /data/reference/genome.fa
I have also sent an email with the results to you. Please let me know if you have any more questions!
Kuan-Hao
Hello,
I am trying to use this tool to annotate a genome assembly of a plant species, Centaurea corymbosa, using the data of a closely related species, Centaurea solstitialis. I have launched the tool like so:
lifton -g annotation.gff -P proteins.fa -t 30 centaurea_genome.fasta genome.fa
with annotation.gff, proteins.fa and genome.fa being Centaurea solstitialis data downloaded directly from NCBI. During the miniprot annotation step, I get this error:
I have checked that the 'rna-OSB04' feature is present in the gff file, but I don't really know what is wrong with it. I thought it might be because of the file format, so I tried to launch the tool with the gtf file instead, but it finds 0 features in the file and treats it as an empty file.
I have also tried the tool with another, less closely related species (Cynara cardunculus), and it worked just fine with the gff file. The command was:
lifton -g annotation.gff -P proteins.fa -T rna.fa -t 30 centaurea_genome.fasta genome.fa
Do you have any idea what might be wrong in the case of C. solstitialis?
Thank you for your help