Edinburgh-Genome-Foundry / DnaFeaturesViewer

:eye: Python library to plot DNA sequence features (e.g. from Genbank files)
https://edinburgh-genome-foundry.github.io/DnaFeaturesViewer/
MIT License

`ValueError` from the `matplotlib` backend #15

Closed jolespin closed 5 years ago

jolespin commented 5 years ago

I'm just trying to plot 3 genes, but I'm getting a ValueError from the matplotlib backend:

from dna_features_viewer import GraphicFeature, GraphicRecord

# df_gff3 (GFF3 table) and loc_target (row index) are defined elsewhere
features = []
data = df_gff3.iloc[loc_target]
start = int(data["pos_start"])
end = int(data["pos_end"])
print(end - start)
features.append(GraphicFeature(start=start, end=end, strand={"+":+1, "-":-1}[data["sense"]], color="teal", label=data["locus_tag"]))
record = GraphicRecord(sequence_length=end - start, features=features)
record.plot(figure_width=100)

# ---------------------------------------------------------------------------
# ValueError                                Traceback (most recent call last)
# ~/anaconda/envs/µ_env/lib/python3.6/site-packages/IPython/core/formatters.py in __call__(self, obj)
#     339                 pass
#     340             else:
# --> 341                 return printer(obj)
#     342             # Finally look for special method names
#     343             method = get_real_method(obj, self.print_method)

# ~/anaconda/envs/µ_env/lib/python3.6/site-packages/IPython/core/pylabtools.py in <lambda>(fig)
#     242 
#     243     if 'png' in formats:
# --> 244         png_formatter.for_type(Figure, lambda fig: print_figure(fig, 'png', **kwargs))
#     245     if 'retina' in formats or 'png2x' in formats:
#     246         png_formatter.for_type(Figure, lambda fig: retina_figure(fig, **kwargs))

# ~/anaconda/envs/µ_env/lib/python3.6/site-packages/IPython/core/pylabtools.py in print_figure(fig, fmt, bbox_inches, **kwargs)
#     126 
#     127     bytes_io = BytesIO()
# --> 128     fig.canvas.print_figure(bytes_io, **kw)
#     129     data = bytes_io.getvalue()
#     130     if fmt == 'svg':

# ~/anaconda/envs/µ_env/lib/python3.6/site-packages/matplotlib/backend_bases.py in print_figure(self, filename, dpi, facecolor, edgecolor, orientation, format, bbox_inches, **kwargs)
#    2073                     orientation=orientation,
#    2074                     bbox_inches_restore=_bbox_inches_restore,
# -> 2075                     **kwargs)
#    2076             finally:
#    2077                 if bbox_inches and restore_bbox:

# ~/anaconda/envs/µ_env/lib/python3.6/site-packages/matplotlib/backends/backend_agg.py in print_png(self, filename_or_obj, *args, **kwargs)
#     508 
#     509         """
# --> 510         FigureCanvasAgg.draw(self)
#     511         renderer = self.get_renderer()
#     512 

# ~/anaconda/envs/µ_env/lib/python3.6/site-packages/matplotlib/backends/backend_agg.py in draw(self)
#     394         Draw the figure using the renderer.
#     395         """
# --> 396         self.renderer = self.get_renderer(cleared=True)
#     397         # acquire a lock on the shared font cache
#     398         RendererAgg.lock.acquire()

# ~/anaconda/envs/µ_env/lib/python3.6/site-packages/matplotlib/backends/backend_agg.py in get_renderer(self, cleared)
#     415 
#     416         if need_new_renderer:
# --> 417             self.renderer = RendererAgg(w, h, self.figure.dpi)
#     418             self._lastKey = key
#     419         elif cleared:

# ~/anaconda/envs/µ_env/lib/python3.6/site-packages/matplotlib/backends/backend_agg.py in __init__(self, width, height, dpi)
#      85         self.width = width
#      86         self.height = height
# ---> 87         self._renderer = _RendererAgg(int(width), int(height), dpi)
#      88         self._filter_renderers = []
#      89 

# ValueError: Image size of 427363x153 pixels is too large. It must be less than 2^16 in each direction.
# <Figure size 7200x158.4 with 1 Axes>

Zulko commented 5 years ago

The sequence_length parameter must always be the length of the full sequence. Originally you had entered 100 (now it is end - start). Matplotlib then takes 100 nucleotides to correspond to the whole 10 inches of the figure width (which you provide with figure_width=10), so if the feature locations have indices like 10000, Matplotlib will attempt to create a figure 1000 inches wide (that's huge!) just to represent the index 10000. As this is way too many pixels, you get the error you observe.
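
The arithmetic behind this can be sketched in a few lines. This is a simplified model of the scaling described above, not Matplotlib's actual layout code: the helper names `required_width_inches` and `exceeds_agg_limit` are hypothetical, and the 72 dpi default is an assumption (it is consistent with the 7200-pixel width reported for the 100-inch figure in the traceback). The 2^16 limit is the one quoted in the error message.

```python
# Simplified model: the axis maps `sequence_length` nucleotides onto
# `figure_width` inches, so anything drawn at position `pos` lands
# pos / sequence_length * figure_width inches from the left edge.
AGG_MAX_PIXELS = 2 ** 16  # per-direction limit quoted in the ValueError


def required_width_inches(max_position, sequence_length, figure_width):
    """Inches needed to reach `max_position` at the scale set by the axis."""
    return max_position / sequence_length * figure_width


def exceeds_agg_limit(width_inches, dpi=72):
    """True if a figure this wide would not fit within the Agg pixel limit."""
    return width_inches * dpi >= AGG_MAX_PIXELS


# sequence_length=100 but a feature sits at index 10000 (numbers from the text)
width = required_width_inches(10000, sequence_length=100, figure_width=10)
print(width)                     # 1000.0 inches
print(exceeds_agg_limit(width))  # True: 72000 px >= 65536 px
```

With a sequence_length that covers every feature, the same calculation stays at or below figure_width inches, which is well within the limit.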

If you only want to plot part of a sequence, have a look at the section about cropping in the README:

https://github.com/Edinburgh-Genome-Foundry/DnaFeaturesViewer#nucleotide-sequences-translations-and-cropping

EDIT: typos

jolespin commented 5 years ago

Thanks for getting back to me so quickly. Is this only for plotting very small sequences? Is there any way to autodetect the sequence_length and adjust accordingly with figure_width?

Zulko commented 5 years ago

To be clear, my point was that if you set figure_width=10 you are guaranteed to get a 10-inch-wide figure whatever the sequence length. It will auto-adjust and there will be no problems.

In your script, a problem appears for a particular reason, which is that you set sequence_length=100 but some of your features have locations very much outside the 0-100 interval, which certainly isn't intended.
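
A quick sanity check along these lines could catch the mismatch before plotting. This is only a sketch: `Feature` is a hypothetical stand-in for GraphicFeature (just start, end and label), `features_outside` is not part of DnaFeaturesViewer, and the coordinates are made up for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for GraphicFeature: only start, end and label matter here.
Feature = namedtuple("Feature", ["start", "end", "label"])


def features_outside(features, sequence_length):
    """Return the features that do not fit inside [0, sequence_length]."""
    return [f for f in features if f.start < 0 or f.end > sequence_length]


features = [
    Feature(start=10, end=90, label="fits"),
    Feature(start=9500, end=10000, label="too far right"),
]
offenders = features_outside(features, sequence_length=100)
print([f.label for f in offenders])  # ['too far right']
```

An empty result means every feature lies within the declared sequence length.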

Can you quickly describe what the purpose of the script is? It might help me point you in the right direction.

jolespin commented 5 years ago

I'm trying to plot gene neighborhoods around a particular locus, so I was going to try ~10 genes upstream and downstream of a gene in Alteromonas macleodii.

Zulko commented 5 years ago

Then I think you should use a cropped record:

record = GraphicRecord(sequence_length=10000, # anything big
                       features=features)
cropped_record = record.crop((START, END))
cropped_record.plot(figure_width=10)

Zulko commented 5 years ago

I just noticed you used figure_width=100 now. This means a figure of 100 inches (=2.5 meters) in width! You should keep it to 10 inches (figure_width=10).

jolespin commented 5 years ago

Haha, yeah, a 2.5 meter figure is a little much for what I was trying to do. Thank you so much for responding! I was about to reinvent the wheel and was very bummed out b/c your figures already look so good.

I ended up using this edit that you suggested:

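from dna_features_viewer import GraphicFeature, GraphicRecord

# df_gff3, loc_target and n_peripheral are defined elsewhere
sequence_length = 4653851  # full sequence length
features = []
positions = []
pad = 10  # extra nucleotides shown on each side of the window
for k in range(-n_peripheral, n_peripheral + 1):
    data = df_gff3.iloc[loc_target + k]
    start = int(data["pos_start"])
    end = int(data["pos_end"])
    strand = {"+": +1, "-": -1}[data["sense"]]
    label = data["locus_tag"]
    feature = GraphicFeature(start=start, end=end, strand=strand, label=label)
    features.append(feature)
    positions += [start, end]
# Build the full-length record, then crop it to the neighborhood of interest
record = GraphicRecord(sequence_length=sequence_length, features=features)
record = record.crop((min(positions) - pad, max(positions) + pad))
record.plot(figure_width=10)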
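
One case the script above does not cover is a neighborhood close to either end of the sequence, where min(positions) - pad could drop below 0 or max(positions) + pad could exceed sequence_length. The thread does not say how crop handles such a window, so the following is a defensive sketch only: `crop_window` is a hypothetical helper, not part of DnaFeaturesViewer, and the positions are illustrative.

```python
def crop_window(positions, sequence_length, pad=10):
    """Return (start, end) spanning all positions plus padding, clamped to the sequence."""
    start = max(0, min(positions) - pad)
    end = min(sequence_length, max(positions) + pad)
    return start, end


# Illustrative coordinates only; sequence_length is the value used in the thread.
positions = [5, 1200, 1300, 2500]
print(crop_window(positions, sequence_length=4653851))  # (0, 2510)
```

The returned tuple could then be passed to crop in place of the inline min/max expression.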

Zulko commented 5 years ago

Great, happy I could be of some help! Let me know if you run into any other trouble.