agshumate / LiftoffTools

GNU General Public License v3.0

comparing two independent assemblies of the same genome #6

Open splaisan opened 9 months ago

splaisan commented 9 months ago

Hi,

I have assembled Chlamydomonas from SRA long-reads and want to compare the annotations made by braker3 to the official reference annotations.

The command below reports that IDs do not match between the two sets, which is expected because the reference has gene names while my assembly has arbitrary braker3 IDs.

liftofftools all -r Crei/Crei.gdna.fa -t ont_draft_assembly_softmask.fasta -rg Crei/Crei.gff3 -tg braker.gff3

2024-01-10 16:31:24,409 - INFO - Populating features
2024-01-10 16:31:54,744 - INFO - Populating features table and first-order relations: 270032 features
2024-01-10 16:31:54,744 - INFO - Updating relations
2024-01-10 16:31:56,855 - INFO - Creating relations(parent) index
2024-01-10 16:31:56,980 - INFO - Creating relations(child) index
2024-01-10 16:31:57,128 - INFO - Creating features(featuretype) index
2024-01-10 16:31:57,254 - INFO - Creating features (seqid, start, end) index
2024-01-10 16:31:57,381 - INFO - Creating features (seqid, start, end, strand) index
2024-01-10 16:31:57,518 - INFO - Running ANALYZE features
2024-01-10 16:31:57,770 - INFO - Populating features
2024-01-10 16:32:30,003 - INFO - Populating features table and first-order relations: 543842 features
2024-01-10 16:32:30,003 - INFO - Updating relations
2024-01-10 16:32:34,496 - INFO - Creating relations(parent) index
2024-01-10 16:32:34,765 - INFO - Creating relations(child) index
2024-01-10 16:32:35,207 - INFO - Creating features(featuretype) index
2024-01-10 16:32:35,446 - INFO - Creating features (seqid, start, end) index
2024-01-10 16:32:35,721 - INFO - Creating features (seqid, start, end, strand) index
2024-01-10 16:32:36,046 - INFO - Running ANALYZE features
/opt/miniconda3/envs/liftofftools/lib/python3.10/site-packages/liftofftools/liftofftools.py:80: UserWarning: There are no gene features with matching IDs in the reference and target annotation
  warnings.warn(mismatch_ids_warning)
Analyzing synteny
/opt/miniconda3/envs/liftofftools/lib/python3.10/site-packages/liftofftools/synteny/synteny.py:123: UserWarning: No features with matching IDs to plot
  warnings.warn(mismatch_ids_warning)
/opt/miniconda3/envs/liftofftools/lib/python3.10/site-packages/liftofftools/synteny/plot_gene_order.py:44: UserWarning: No features with matching IDs to plot
  warnings.warn(mismatch_ids_warning)
Extracting transcript sequences
Analyzing protein-coding clusters
Analyzing noncoding clusters
Analyzing variants
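
The "no gene features with matching IDs" warning can be confirmed independently of the tool. Below is a minimal sketch, assuming both files are tab-separated GFF3 in which gene rows carry an `ID=` attribute in column 9. The function names and the toy records (`REF_GENE_1`, `g1`, the sequence names) are hypothetical and only illustrate the check; this is not how LiftoffTools itself does the comparison.

```python
import io


def gene_ids(handle):
    """Collect the ID= attribute of every row whose feature type is 'gene'."""
    ids = set()
    for line in handle:
        # Skip comments, directives and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        # A GFF3 feature row has exactly 9 tab-separated columns.
        if len(cols) != 9 or cols[2] != "gene":
            continue
        for field in cols[8].split(";"):
            key, _, value = field.strip().partition("=")
            if key == "ID":
                ids.add(value)
                break
    return ids


def shared_gene_ids(ref_handle, target_handle):
    """Return the gene IDs present in both annotations."""
    return gene_ids(ref_handle) & gene_ids(target_handle)


# Toy input with hypothetical IDs: a named reference gene and an arbitrary
# predictor-style ID, so the two sets have nothing in common.
ref = io.StringIO("chr1\tref\tgene\t1\t100\t.\t+\t.\tID=REF_GENE_1;Name=foo\n")
tgt = io.StringIO("contig_1\tpred\tgene\t5\t90\t.\t+\t.\tID=g1\n")
print(len(shared_gene_ids(ref, tgt)))
```

With real data, the two `io.StringIO` objects would be replaced by `open("Crei/Crei.gff3")` and `open("braker.gff3")`. An empty intersection matches the situation described by the warning.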

Is there something 'easy' I can do to use your tool to compare the two assemblies at the annotation level and return quality metrics (completeness, matched genes, ...)?

I tried to use AEGeAn but it also failed to run because of the GFF format.

Thanks for your time and advice, Stephane

Andy-B-123 commented 6 months ago

Hi, not the author but would gffcompare be useful?

splaisan commented 6 months ago

> Hi, not the author but would gffcompare be useful?

Hi, thanks for your suggestion.

Not in my experience, as the contigs and coordinates differ between assemblies. I need a comparison that includes sequence alignments at some point.
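
One coordinate-independent way to pair genes could be reciprocal best hits. Here is a minimal sketch, assuming the protein sets of both annotations have already been aligned against each other in both directions with an aligner that writes the 12-column BLAST-style tabular format (query ID in column 1, subject ID in column 2, bit score in column 12). The function names and the toy hit rows are hypothetical, and this is not a LiftoffTools feature.

```python
def best_hits(rows):
    """Map each query ID to its (subject ID, bit score) with the highest bit score."""
    best = {}
    for row in rows:
        cols = row.rstrip("\n").split("\t")
        if len(cols) < 12:
            continue  # not a 12-column tabular hit line
        query, subject, bitscore = cols[0], cols[1], float(cols[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def reciprocal_best_hits(ref_vs_target, target_vs_ref):
    """Return sorted (reference ID, target ID) pairs that are each other's best hit."""
    fwd = best_hits(ref_vs_target)
    rev = best_hits(target_vs_ref)
    return sorted(
        (ref_id, tgt_id)
        for ref_id, (tgt_id, _) in fwd.items()
        if tgt_id in rev and rev[tgt_id][0] == ref_id
    )


# Toy hits with hypothetical IDs; only the ID and bit score columns matter here.
fwd_rows = [
    "REF_A\tg1\t98.0\t300\t6\t0\t1\t300\t1\t300\t1e-150\t550",
    "REF_A\tg2\t60.0\t280\t100\t2\t1\t280\t5\t284\t1e-40\t150",
]
rev_rows = [
    "g1\tREF_A\t98.0\t300\t6\t0\t1\t300\t1\t300\t1e-150\t550",
    "g2\tREF_A\t60.0\t280\t100\t2\t5\t284\t1\t280\t1e-40\t150",
]
print(reciprocal_best_hits(fwd_rows, rev_rows))
```

The number of paired genes relative to the reference gene count would give a rough "matched genes" figure. Ties in bit score are resolved here by keeping the first hit seen, which is an arbitrary choice.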