streetslab / dimelo

python package for analysis of dimelo-seq & nanopore modified base data
MIT License

ValueError: The palette dictionary is missing keys: {'A+a.', 'C+m.'} #40

Open yxu405 opened 9 months ago

yxu405 commented 9 months ago

When I ran the command dm.plot_enrichment_profile(bam, "R10_analysis", bed, "A+CG", "/private/groups/migalab/dan/meth_R10/20211031_DD_HG002_nom6A_amp_negcon/20211031_DD_HG002_nom6A_amp_negcon_6mA_5mC_basecalled", windowSize=500, dotsize=1, threshC=190, threshA=129, cores=24)

I got the following error:

Parsing windows: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3000/3000 [08:40<00:00,  5.76windows/s]
Outputs windows: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3000/3000 [08:40<00:00,  4.72windows/s]
_______
DB file: /private/groups/migalab/dan/meth_R10/20211031_DD_HG002_nom6A_amp_negcon/20211031_DD_HG002_nom6A_amp_negcon_6mA_5mC_basecalled/20211031_DD_HG002_nom6A_amp_negcon_6mA_5mC_winnowmap_decap_filtered.db
processing 2351 reads with methylation above threshold for R10_analysis for bam: /private/groups/migalab/dan/meth_R10/20211031_DD_HG002_nom6A_amp_negcon/20211031_DD_HG002_nom6A_amp_negcon_6mA_5mC_winnowmap_decap_filtered.bam
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/private/home/yxu267/dimelo/dimelo/plot_enrichment_profile.py", line 216, in plot_enrichment_profile
    execute_single_plot(
  File "/private/home/yxu267/dimelo/dimelo/plot_enrichment_profile.py", line 356, in execute_single_plot
    sns.scatterplot(
  File "/private/home/yxu267/anaconda3/envs/dimelo/lib/python3.10/site-packages/seaborn/relational.py", line 609, in scatterplot
    p.map_hue(palette=palette, order=hue_order, norm=hue_norm)
  File "/private/home/yxu267/anaconda3/envs/dimelo/lib/python3.10/site-packages/seaborn/_base.py", line 838, in map_hue
    mapping = HueMapping(self, palette, order, norm, saturation)
  File "/private/home/yxu267/anaconda3/envs/dimelo/lib/python3.10/site-packages/seaborn/_base.py", line 150, in __init__
    levels, lookup_table = self.categorical_mapping(
  File "/private/home/yxu267/anaconda3/envs/dimelo/lib/python3.10/site-packages/seaborn/_base.py", line 234, in categorical_mapping
    raise ValueError(err.format(missing))
ValueError: The palette dictionary is missing keys: {'A+a.', 'C+m.'}

I double-checked the BAM file and the mods were all there. I previously received the "database is locked" error and was able to bypass it by increasing the number of cores. Since the parser ran successfully, I am not sure why the plotting step is complaining. Do you have any suggestions on where to look to fix this error?

Thank you so much!

thekugelmeister commented 9 months ago

Hi! This is not an error I've encountered before, so I'm going to be taking some wild stabs.

Looking at plot_enrichment_profile, we can see where this palette is defined and used:

https://github.com/streetslab/dimelo/blob/7ff463273436d39ad4bf6dd0cbcc6c08cd4209cb/dimelo/plot_enrichment_profile.py#L354-L367

I find it interesting that A+a and C+m are even specified there, since I'm fairly certain the rest of the package is hard-coded to use only the older Megalodon-specific A+Y and C+Z tags.
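
For what it's worth, the traceback shows the error coming from seaborn itself: when the palette is passed as a dict, every hue level present in the data needs a matching key. The missing keys in the message carry a trailing period (A+a., C+m.), which may be why they fail to match. Here is a minimal sketch of that check in plain Python. The palette contents and colors below are hypothetical placeholders, not copied from the package, and missing_palette_keys is a helper name I made up for illustration.

```python
def missing_palette_keys(hue_levels, palette):
    """Return the hue levels that have no color assigned in the palette dict."""
    return set(hue_levels) - set(palette)


# Hypothetical palette: keys follow the tags mentioned in this thread,
# colors are arbitrary placeholders.
palette = {
    "A+Y": "#1f77b4",
    "A+a": "#1f77b4",
    "C+Z": "#d62728",
    "C+m": "#d62728",
}

# Labels exactly as they appear in the error message (note the trailing period).
levels_in_data = ["A+a.", "C+m."]

missing = missing_palette_keys(levels_in_data, palette)
print(sorted(missing))

# Assumption: if the only difference is a trailing "." or "?", stripping it
# before the lookup makes the labels match the palette keys.
normalized = [label.rstrip(".?") for label in levels_in_data]
print(sorted(missing_palette_keys(normalized, palette)))
```

If the second call comes back empty on your labels, the mismatch is only in the suffix and not in the modification codes themselves.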

Out of curiosity, does your BAM file specify the methylation types as A+a and C+m rather than A+Y and C+Z? If so, my guess is that parsing is silently failing too, although some of the other output makes me doubt this.
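
One quick way to answer this is to copy the modified-base tag (MM, or Mm in some older files) from a single read, for example out of samtools view, and list the labels it declares. The sketch below is plain Python and does not use any dimelo code. mod_labels is a hypothetical helper, and the example tag strings are made up to show the two naming styles; they are not taken from your file.

```python
import re

# One modification header: canonical base, strand, code, optional "." or "?" flag.
# Uppercase codes are accepted so that Megalodon-style A+Y / C+Z also match.
MOD_HEADER = re.compile(r"^[ACGTUN][+-](?:[A-Za-z]+|[0-9]+)[.?]?$")


def mod_labels(mm_tag_value):
    """Return the modification labels declared in an MM tag value.

    mm_tag_value is the part after 'MM:Z:', e.g. 'A+a.,0,3;C+m.,1,5;'.
    """
    labels = []
    for entry in mm_tag_value.split(";"):
        if not entry:
            continue  # skip the empty piece after the final ';'
        header = entry.split(",", 1)[0]
        if MOD_HEADER.match(header) is None:
            raise ValueError(f"unrecognized modification header: {header!r}")
        labels.append(header)
    return labels


# Made-up examples of the two naming styles discussed in this thread.
print(mod_labels("A+a.,0,3;C+m.,1,5;"))
print(mod_labels("A+Y,0,3;C+Z,1,5;"))
```

Whatever labels come back from your own reads are the strings worth comparing against the palette keys in the traceback.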

Just in case, can you open up the .db file using sqlite3 and count how many rows are in one or both of the tables (e.g. SELECT COUNT(*) FROM methylationAggregate_R10_analysis;)?
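
If you would rather do this from Python than from the sqlite3 shell, something like the following should work with only the standard library. It lists every table in the file and counts its rows, so you don't need to know the table names up front. count_rows is a name I made up, and the path in the usage comment is a placeholder for your own .db file.

```python
import sqlite3


def count_rows(db_path):
    """Return {table_name: row_count} for every table in the SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'"
            )
        ]
        counts = {}
        for table in tables:
            # Table names come from sqlite_master, so quoting them is enough here.
            counts[table] = conn.execute(
                f'SELECT COUNT(*) FROM "{table}"'
            ).fetchone()[0]
        return counts
    finally:
        conn.close()


# Usage (placeholder path, substitute the DB file printed in your log):
# for name, n in count_rows("/path/to/your_file.db").items():
#     print(name, n)
```

A count of zero in either table would point to the parsing step and not the plotting step.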

I'm not sure what else might be causing this. Are you able to successfully run dm.qc_report?

Also, sorry about the ongoing database locking error. We are actively working on a new version of the package that circumvents this issue entirely. Hopefully we will have something soon that we can get into your hands.