rrazaghi / modbamtools

Set of tools to manipulate and visualize modified base bam files
Apache License 2.0

Heterogeneity calculation does not work without a GTF file #35

Open pwh124 opened 1 year ago

pwh124 commented 1 year ago

-ht doesn't run if a GTF is not provided.

Runs

modbamtools plot --batch genes.bed --gtf gencode.v38.annotation.sorted.gtf.gz -ht --cluster --prefix test_batch_hp_removed_gm12878_batch --samples GM12878 --track-titles Genes --out . hp_removed_gm12878_ul_sup_megalodon_chr20.bam

Does not run (no gtf)

modbamtools plot --batch genes.bed -ht --cluster --prefix test_batch_hp_removed_gm12878_batch --samples GM12878 --out . hp_removed_gm12878_ul_sup_megalodon_chr20.bam

Runs (--cluster only)

modbamtools plot --batch genes.bed --cluster --prefix test_batch_hp_removed_gm12878_batch --samples GM12878 --out . hp_removed_gm12878_ul_sup_megalodon_chr20.bam

Does not run (-ht only)

modbamtools plot --batch genes.bed -ht --prefix test_batch_hp_removed_gm12878_batch --samples GM12878 --out . hp_removed_gm12878_ul_sup_megalodon_chr20.bam
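The four commands above vary only in whether `-ht`, `--cluster`, and `--gtf` are present. As a rough sketch, the same flag matrix can be generated and run from Python so every combination is tested the same way. The flag names are copied from the commands in this report; `sample.bam`, the `flag_matrix` prefix, and the helper names `build_command` and `run_matrix` are placeholders invented for illustration. Following the working command above, `--track-titles Genes` is added whenever a GTF is passed.

```python
import itertools
import shlex
import subprocess

# Flags come from the commands in this report; file names are placeholders.
BASE = ["modbamtools", "plot", "--batch", "genes.bed"]
TAIL = ["--prefix", "flag_matrix", "--samples", "GM12878", "--out", ".", "sample.bam"]


def build_command(ht, cluster, gtf=None):
    """Return the argv list for one combination of -ht, --cluster and --gtf."""
    cmd = list(BASE)
    if gtf:
        # Mirrors the working command, which pairs --gtf with --track-titles.
        cmd += ["--gtf", gtf, "--track-titles", "Genes"]
    if ht:
        cmd.append("-ht")
    if cluster:
        cmd.append("--cluster")
    return cmd + TAIL


def run_matrix(gtf_path):
    """Run all eight combinations and print whether each exited cleanly."""
    for ht, cluster, gtf in itertools.product(
        [False, True], [False, True], [None, gtf_path]
    ):
        cmd = build_command(ht, cluster, gtf)
        result = subprocess.run(cmd, capture_output=True, text=True)
        status = "ok" if result.returncode == 0 else "FAILED"
        print(f"{status:6s} {shlex.join(cmd)}")
        if result.returncode != 0:
            # The last stderr line usually carries the exception name.
            print("       ", result.stderr.strip().splitlines()[-1:])


# Example call (requires modbamtools on PATH and real input files):
# run_matrix("gencode.v38.annotation.sorted.gtf.gz")
```

Running every combination in one pass makes it easier to see which flag is tied to the failure, rather than comparing commands typed by hand.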

Error message



Traceback (most recent call last):
  File "/home/phook2/miniconda3/envs/modbamtools/bin/modbamtools", line 8, in <module>
    sys.exit(cli())
  File "/home/phook2/miniconda3/envs/modbamtools/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/phook2/miniconda3/envs/modbamtools/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/phook2/miniconda3/envs/modbamtools/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/phook2/miniconda3/envs/modbamtools/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/phook2/miniconda3/envs/modbamtools/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/phook2/miniconda3/envs/modbamtools/lib/python3.8/site-packages/modbamtools/cli.py", line 269, in plot
    plot = Plotter(
  File "/home/phook2/miniconda3/envs/modbamtools/lib/python3.8/site-packages/modbamtools/modbamviz.py", line 100, in __init__
    assert self.num_tracks == len(self.titles)
AssertionError

ryansohny commented 10 months ago

Dear Dr. Hook, I am writing to ask about the status of this issue. I am getting the same error (AssertionError). Has it been resolved yet?

pwh124 commented 10 months ago

The issue is that -ht requires a GTF to run. We have not worked on a fix, but providing a GENCODE GTF may get it to run.

What command are you running?

Paul

ryansohny commented 10 months ago

The exact command line I used was:

modbamtools plot --region chr6:31164337-31180731 --out . --heterogeneity --gtf gencode.v41.annotation.sorted.gtf.gz --prefix test_hetero_POU5F1 --samples sample_name sample_name_haplotagged_wIndels_splt.bam

I was wondering if this might be due to not removing the HP tag in the BAM file, but I'm not quite sure.

(And btw I forgot to mention in my previous question just how wonderful a visualization tool modbamtools is! Thanks for creating this)

pwh124 commented 10 months ago

Hi,

Thank you for your kind words!

Not sure about the HP tag, but in reviewing what we had done above, I think the issue may be that you need the --cluster flag for it to work. As shown above, -ht without --cluster doesn't seem to work.

Try to add --cluster to your command and let me know, please.

Paul

ryansohny commented 10 months ago

Hmm. It still doesn't work with the --cluster parameter added, and I'm not sure why.

Here's the command I ran:

modbamtools plot --region chr6:31164337-31180731 --out . --cluster --heterogeneity --gtf gencode.v41.annotation.sorted.gtf.gz --prefix test_hetero_POU5F1 --samples sample_name sample_name_haplotagged_wIndels_splt.bam

Error Message:


[-1, -1]
Traceback (most recent call last):
  File "/mnt/mone/Project/WC300/Tools/Anaconda3/envs/modbamtools/bin/modbamtools", line 8, in <module>
    sys.exit(cli())
  File "/mnt/mone/Project/WC300/Tools/Anaconda3/envs/modbamtools/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/mnt/mone/Project/WC300/Tools/Anaconda3/envs/modbamtools/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/mnt/mone/Project/WC300/Tools/Anaconda3/envs/modbamtools/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/mnt/mone/Project/WC300/Tools/Anaconda3/envs/modbamtools/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/mnt/mone/Project/WC300/Tools/Anaconda3/envs/modbamtools/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/mnt/mone/Project/WC300/Tools/Anaconda3/envs/modbamtools/lib/python3.8/site-packages/modbamtools/cli.py", line 334, in plot
    fig = Plotter(
  File "/mnt/mone/Project/WC300/Tools/Anaconda3/envs/modbamtools/lib/python3.8/site-packages/modbamtools/modbamviz.py", line 100, in __init__
    assert self.num_tracks == len(self.titles)
AssertionError

FYI, I sorted the GENCODE GTF file using gff3sort.pl:

gff3sort.pl \
--precise \
--chr_order \
alphabet \
gencode.v41.annotation.gtf \
> gencode.v41.annotation.sorted.gtf

bgzip -c gencode.v41.annotation.sorted.gtf > gencode.v41.annotation.sorted.gtf.gz
tabix -p gff gencode.v41.annotation.sorted.gtf.gz
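To rule out the sorting step as a factor, the sorted GTF can be checked independently of modbamtools. The sketch below is only a sanity check, and nothing in this thread shows that GTF ordering causes the AssertionError. The function name `check_gtf_sorted` is made up for illustration. It verifies the two properties a position-sorted annotation should have: each sequence name forms a single contiguous block, and start coordinates never decrease within a block. It reads plain or `.gz` files (bgzip output is readable with Python's `gzip` module).

```python
import gzip


def check_gtf_sorted(path):
    """Return (True, None) if records are position-sorted, else (False, message).

    Checks that each sequence name forms one contiguous block and that
    start coordinates never decrease within a block.
    """
    seen = set()
    current = None
    last_start = 0
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line_no, line in enumerate(handle, start=1):
            # Skip header/comment lines and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            chrom, start = fields[0], int(fields[3])
            if chrom != current:
                if chrom in seen:
                    return False, f"line {line_no}: {chrom} appears in more than one block"
                seen.add(chrom)
                current, last_start = chrom, start
            elif start < last_start:
                return False, f"line {line_no}: start {start} precedes {last_start} on {chrom}"
            else:
                last_start = start
    return True, None


# Example call on the file produced above:
# print(check_gtf_sorted("gencode.v41.annotation.sorted.gtf.gz"))
```

If this returns `(True, None)`, the sort and compression steps are unlikely to be the problem, and attention can go back to the flag combination.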

I apologize if I’m causing too much inconvenience with this issue.

Sohn