rrazaghi / modbamtools

Set of tools to manipulate and visualize modified base bam files
Apache License 2.0

Heterogeneity calculation does not work without a GTF file #35

Open pwh124 opened 1 year ago

pwh124 commented 1 year ago

-ht doesn't run if a GTF is not provided.

Runs

modbamtools plot --batch genes.bed --gtf gencode.v38.annotation.sorted.gtf.gz -ht --cluster --prefix test_batch_hp_removed_gm12878_batch --samples GM12878 --track-titles Genes --out . hp_removed_gm12878_ul_sup_megalodon_chr20.bam

Does not run (no gtf)

modbamtools plot --batch genes.bed -ht --cluster --prefix test_batch_hp_removed_gm12878_batch --samples GM12878 --out . hp_removed_gm12878_ul_sup_megalodon_chr20.bam

Runs (--cluster only)

modbamtools plot --batch genes.bed --cluster --prefix test_batch_hp_removed_gm12878_batch --samples GM12878 --out . hp_removed_gm12878_ul_sup_megalodon_chr20.bam

Does not run (-ht only)

modbamtools plot --batch genes.bed -ht --prefix test_batch_hp_removed_gm12878_batch --samples GM12878 --out . hp_removed_gm12878_ul_sup_megalodon_chr20.bam
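
The four runs above vary three things: `--gtf`, `-ht` and `--cluster`. A small sketch of how the full set of combinations could be generated and run is given here, so that each flag is tested in isolation. It uses only the flags and file names that appear in the commands above; `build_command` and `run_matrix` are hypothetical helper names, and the exit code is assumed to be non-zero when the plot fails.

```python
import shutil
import subprocess

# File names copied from the commands in this report.
BAM = "hp_removed_gm12878_ul_sup_megalodon_chr20.bam"
GTF = "gencode.v38.annotation.sorted.gtf.gz"


def build_command(gtf=False, heterogeneity=False, cluster=False, track_titles=None):
    """Build one `modbamtools plot` command line as an argument list."""
    cmd = ["modbamtools", "plot", "--batch", "genes.bed"]
    if gtf:
        cmd += ["--gtf", GTF]
    if heterogeneity:
        cmd.append("-ht")
    if cluster:
        cmd.append("--cluster")
    if track_titles:
        cmd += ["--track-titles", track_titles]
    cmd += [
        "--prefix", "test_batch_hp_removed_gm12878_batch",
        "--samples", "GM12878",
        "--out", ".",
        BAM,
    ]
    return cmd


def run_matrix():
    """Run all eight flag combinations and return their exit codes."""
    if shutil.which("modbamtools") is None:
        raise RuntimeError("modbamtools not found on PATH")
    results = {}
    for gtf in (False, True):
        for heterogeneity in (False, True):
            for cluster in (False, True):
                cmd = build_command(gtf, heterogeneity, cluster)
                proc = subprocess.run(cmd, capture_output=True, text=True)
                results[(gtf, heterogeneity, cluster)] = proc.returncode
    return results
```

Note that the one `-ht` run that worked above also passed `--track-titles Genes`, so it may be worth repeating the matrix with `track_titles="Genes"` as well.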

Error message



```
Traceback (most recent call last):
  File "/home/phook2/miniconda3/envs/modbamtools/bin/modbamtools", line 8, in <module>
    sys.exit(cli())
  File "/home/phook2/miniconda3/envs/modbamtools/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/phook2/miniconda3/envs/modbamtools/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/phook2/miniconda3/envs/modbamtools/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/phook2/miniconda3/envs/modbamtools/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/phook2/miniconda3/envs/modbamtools/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/phook2/miniconda3/envs/modbamtools/lib/python3.8/site-packages/modbamtools/cli.py", line 269, in plot
    plot = Plotter(
  File "/home/phook2/miniconda3/envs/modbamtools/lib/python3.8/site-packages/modbamtools/modbamviz.py", line 100, in __init__
    assert self.num_tracks == len(self.titles)
AssertionError
```

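
The traceback ends at `assert self.num_tracks == len(self.titles)`, a bare assert with no message, which is why nothing is printed after `AssertionError`. The following is a minimal sketch of that kind of check, showing how a mismatch between the number of tracks and the number of titles produces this error. `ToyPlotter` and `try_build` are hypothetical names and not the modbamtools code; how modbamtools actually counts tracks and titles is not shown in this thread.

```python
class ToyPlotter:
    """Hypothetical stand-in; NOT the modbamtools Plotter class."""

    def __init__(self, tracks, titles):
        self.num_tracks = len(tracks)
        self.titles = list(titles)
        # Same shape of check as the line in the traceback, but with a
        # message so the mismatch is visible when it fails.
        assert self.num_tracks == len(self.titles), (
            f"{self.num_tracks} tracks but {len(self.titles)} titles"
        )


def try_build(tracks, titles):
    """Return None on success, or the assertion message on failure."""
    try:
        ToyPlotter(tracks, titles)
    except AssertionError as exc:
        return str(exc)
    return None


# One annotation track with one title passes the check.
print(try_build(["gtf"], ["Genes"]))
# A title with no matching track fails it.
print(try_build([], ["Genes"]))
```

If the real check behaves similarly, comparing the number of track files passed to the command against the number of `--track-titles` values would be a reasonable first thing to inspect.
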
ryansohny commented 1 year ago

Dear Dr. Hook, I am writing to ask about the status of this issue. I am getting the same error (AssertionError). Has it been resolved yet?

pwh124 commented 1 year ago

The issue is that -ht requires a GTF to run. We have not worked on fixing the issue, but if you provide a GENCODE GTF, it may fix it.

What command are you running?

Paul

ryansohny commented 1 year ago

The exact command line I used was:

modbamtools plot --region chr6:31164337-31180731 --out . --heterogeneity --gtf gencode.v41.annotation.sorted.gtf.gz --prefix test_hetero_POU5F1 --samples sample_name sample_name_haplotagged_wIndels_splt.bam

I was wondering if this might be due to not removing the HP tag in the BAM file, but I'm not quite sure.

(And btw I forgot to mention in my previous question just how wonderful a visualization tool modbamtools is! Thanks for creating this)

pwh124 commented 1 year ago

Hi,

Thank you for your kind words!

Not sure about the HP tag, but in reviewing what we did above, I think the issue may be that you need the --cluster flag for it to work. As shown above, -ht without --cluster doesn't seem to work.

Try to add --cluster to your command and let me know, please.

Paul

ryansohny commented 1 year ago

Hmm. I'm not sure why, but it still doesn't work with the added --cluster parameter.

Here's the command I ran:

modbamtools plot --region chr6:31164337-31180731 --out . --cluster --heterogeneity --gtf gencode.v41.annotation.sorted.gtf.gz --prefix test_hetero_POU5F1 --samples sample_name sample_name_haplotagged_wIndels_splt.bam

Error Message:


[-1, -1]
Traceback (most recent call last):
  File "/mnt/mone/Project/WC300/Tools/Anaconda3/envs/modbamtools/bin/modbamtools", line 8, in <module>
    sys.exit(cli())
  File "/mnt/mone/Project/WC300/Tools/Anaconda3/envs/modbamtools/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/mnt/mone/Project/WC300/Tools/Anaconda3/envs/modbamtools/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/mnt/mone/Project/WC300/Tools/Anaconda3/envs/modbamtools/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/mnt/mone/Project/WC300/Tools/Anaconda3/envs/modbamtools/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/mnt/mone/Project/WC300/Tools/Anaconda3/envs/modbamtools/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/mnt/mone/Project/WC300/Tools/Anaconda3/envs/modbamtools/lib/python3.8/site-packages/modbamtools/cli.py", line 334, in plot
    fig = Plotter(
  File "/mnt/mone/Project/WC300/Tools/Anaconda3/envs/modbamtools/lib/python3.8/site-packages/modbamtools/modbamviz.py", line 100, in __init__
    assert self.num_tracks == len(self.titles)
AssertionError

FYI, I sorted the GENCODE GTF file using gff3sort.pl:

gff3sort.pl \
--precise \
--chr_order \
alphabet \
gencode.v41.annotation.gtf \
> gencode.v41.annotation.sorted.gtf

bgzip -c gencode.v41.annotation.sorted.gtf > gencode.v41.annotation.sorted.gtf.gz
tabix -p gff gencode.v41.annotation.sorted.gtf.gz
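
Since tabix expects records grouped by contig and sorted by start position within each contig, one way to rule out the annotation file as a factor is to check that ordering directly. The sketch below does only that; `first_unsorted_line` and `check_gtf` are hypothetical helper names, and nothing in this thread establishes that sort order is related to the AssertionError.

```python
import gzip


def first_unsorted_line(lines):
    """Return the 1-based number of the first line that breaks the
    (contig block, start) order, or None if none does."""
    seen_contigs = set()
    current = None
    last_start = -1
    for number, line in enumerate(lines, start=1):
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        contig, start = fields[0], int(fields[3])  # GTF column 4 is start
        if contig != current:
            if contig in seen_contigs:
                return number  # contig reappears after a different one
            seen_contigs.add(contig)
            current, last_start = contig, start
        elif start < last_start:
            return number  # start position goes backwards
        else:
            last_start = start
    return None


def check_gtf(path):
    """Check a gzip- or bgzip-compressed GTF file on disk."""
    with gzip.open(path, "rt") as handle:
        return first_unsorted_line(handle)
```

For example, `check_gtf("gencode.v41.annotation.sorted.gtf.gz")` returning `None` would indicate that the positional ordering is fine.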

I apologize if I’m causing too much inconvenience with this issue.

Sohn