plotly / plotly.py

The interactive graphing library for Python :sparkles: This project now includes Plotly Express!
https://plotly.com/python/
MIT License
16.11k stars 2.54k forks source link

memory leak after plotting? #487

Closed choldgraf closed 6 years ago

choldgraf commented 8 years ago

I've noticed that after leaving interactive 3d plots open for a long time, the browser starts to run really, really slowly.

This might be on the JS side, but is there a function in python for clearing the cache / javascript memory / etc? Usually my browser will crash after I make about 10 trisurf plots, and I have to restart it as well as my python kernel.

Anybody else experience this?

choldgraf commented 8 years ago

note - this is plotting in offline mode, with pretty large 3d surfaces (60,000 triangles or so). It works fine for the first few plots, however.

jackparmer commented 8 years ago

FYI plotly.js WebGL folks @mikolalysenko @etpinard @monfera ^^^

@choldgraf We've seen slow memory leaks from WebGL plots in the past, though not in a while. Thanks for reporting!

choldgraf commented 8 years ago

hey folks - was anybody able to replicate this problem?

monfera commented 8 years ago

@choldgraf Hello Chris, have you a minimal example, e.g. specific type of plot? Also, how long does this issue generally take to manifest, and is it the specific browser tab that gets slow (slow to rotate/pan I guess?) or the entire browser, i.e. all its tabs? Which browser are you using, recent Chrome? Lastly, what hardware is it sitting on, especially, is it integrated graphics or dedicated video, and possibly, the size of the GPU memory?

choldgraf commented 8 years ago

Sorry @monfera , I should have been more explicit in my original post! I'm getting this problem when making trisurf plots, which is calling Mesh3d under the hood, I believe. I'm plotting a large amount of triangles, ~60K or so. I can try to put together some simple code to re-create the problem, though it's not going to be super clear-cut since there's not an error or anything.

monfera commented 8 years ago

@choldgraf thank you for the information! I've been wrapping up a functional addition to the WebGL based plotting. As soon as it's done, fixing memory leaks is planned to be the next item.

choldgraf commented 8 years ago

Sounds great, I appreciate the help. I'd open a PR if I was efficient at all with javascript, sorry to ask you to do it.

In a sort-of related question, do you know how optimized the webgl plotting scripts are? I'm trying to think if there are any low-hanging fruit for speeding up some of the heavier 3D visualizations.

monfera commented 8 years ago

Currently we have a somewhat linear data flow that recalculates more than what's needed if there's incremental data. Besides the memory leak issue, the other item I'm scheduled to work on next is incremental performance measurements and enhancements. The two tasks have a good bit of overlap because both issues involve profiling and identification of improvement points. I trust there'll be quite a few low hanging fruits but it's hard to speculate. One possibility is using typed float arrays upstream (from the moment of receiving the data) so as to increase efficiency and possibly reduce intermediate memory allocations, and another relates to speeding up data transfer into plotly.js, but I'd rather not pinpoint anything before solid profiling and some proof of concept.

choldgraf commented 8 years ago

Currently we have a somewhat linear data flow that recalculates more than what's needed if there's incremental data.

Ah, sounds like that might be related to some slowdown on plotting large datasets I've been experiencing (specifically again with Mesh3d, but I haven't really tried it with other plot types)

The two tasks have a good bit of overlap because both issues involve profiling and identification of improvement points. I trust there'll be quite a few low hanging fruits but it's hard to speculate.

I recently took a similar pass on the python 3d trisurf library, and there were a lot of really easy fixes that sped things up by a factor of 10+, so hopefully something similar for JS :)

Please let me know if there's anything I could help with...not super experienced with JS but perhaps I could help benchmarking in python or something

monfera commented 8 years ago

Thanks Chris for your offer! Since it's good to consider 'in vivo' use cases, it would actually be quite helpful to use a minimalistic case that still fairly characterizes the speed requirements and has as few parts as possible, e.g. constraining it into a single HTML + JS file where the HTML file links to the plotly script CDN URL and the JS code generates a (random, or mathematical) mesh, i.e. no actual data payload is necessary. Of course the Javascript calls var beginsAt = Performance.now(); [slow code]; var endsAt = Performance.now(); would exclude the large input generation.

We have examples on codepen where the whole thing is the most basic HTML file and a JS file with very simple invocation, not unlike the Python API call, e.g.

Data generation can be as simple as making a surface for a grid where z[i] = Math.sin(i) + Math.cos(i); as long as it generates roughly the same amount of work for plotly.js and representative of what you want to see speed up.

choldgraf commented 8 years ago

Cool - is it cool if I wrote something in Python? That's partially because the data transfer to JS also seems to be a potential point of speedup.

In this case, I could probably just simulate a sphere w/ the same number of triangle faces as the brains that I've been plotting...that sound alright?

On Mon, Jun 13, 2016 at 1:04 PM, Robert Monfera notifications@github.com wrote:

Thanks Chris for your offer! Since it's good to consider 'in vivo' use cases, it would actually be quite helpful to use a minimalistic case that still fairly characterizes the speed requirements and has as few parts as possible, e.g. constraining it into a single HTML + JS file where the HTML file links to the plotly script CDN URL and the JS code generates a (random, or mathematical) mesh, i.e. no actual data payload is necessary. Of course the Javascript calls var beginsAt = Performance.now(); [slow code]; var endsAt = Performance.now(); would exclude the large input generation.

We have examples on codepen where the whole thing is the most basic HTML file and a JS file with very simple invocation, not unlike the Python API call, e.g.

Data generation can be as simple as making a surface for a grid where z[i] = Math.sin(i) + Math.cos(i); as long as it generates roughly the same amount of work for plotly.js and representative of what you want to see speed up.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/plotly/plotly.py/issues/487#issuecomment-225692051, or mute the thread https://github.com/notifications/unsubscribe/ABwSHWCrAVO0A6uNWkxHhCxo2z609uInks5qLbe-gaJpZM4IpKnV .

monfera commented 8 years ago

@choldgraf Yes it sounds good, thank you! The simpler the code, the better (no other Python libs etc.) because I'll probably transcribe it into JS for quick iterations. If you see that something doesn't impact performance, it can be omitted. E.g. if the absence of a face color array doesn't speed up processing, then it can be ignored. Also it's helpful if you can share some measurement, e.g. measuring time in python, or in JS with Performance.now(), to have a rough guide of time ratio spent with interfacing vs. inside plotly.js, but don't worry about it if it's too tedious, I can do that too.

choldgraf commented 8 years ago

OK - is it alright if I put it in a jupyter notebook for the python side of things? That way you could use %lprun etc to get performance by lines and such...

On Mon, Jun 13, 2016 at 3:11 PM, Robert Monfera notifications@github.com wrote:

@choldgraf https://github.com/choldgraf Yes it sounds good, thank you! The simpler the code, the better (no other Python libs etc.) because I'll probably transcribe it into JS for quick iterations. If you see that something doesn't impact performance, it can be omitted. E.g. if the absence of a face color array doesn't speed up processing, then it can be ignored. Also it's helpful if you can share some measurement, e.g. measuring time in python, or in JS with Performance.now(), to have a rough guide of time ratio spent with interfacing vs. inside plotly.js, but don't worry about it if it's too tedious, I can do that too.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/plotly/plotly.py/issues/487#issuecomment-225724327, or mute the thread https://github.com/notifications/unsubscribe/ABwSHQvRk3ZKQDihhvrFcPifUuaikYV5ks5qLdWUgaJpZM4IpKnV .

monfera commented 8 years ago

Perfect, works great e.g. in a github repo file!

choldgraf commented 8 years ago

gimme 5 minutes :)

On Mon, Jun 13, 2016 at 3:18 PM, Robert Monfera notifications@github.com wrote:

Perfect, works great e.g. in a github repo file!

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/plotly/plotly.py/issues/487#issuecomment-225725781, or mute the thread https://github.com/notifications/unsubscribe/ABwSHcaz_CyLnXkxgC0EhpDWxh5NJ3jcks5qLdcsgaJpZM4IpKnV .

choldgraf commented 8 years ago

OK, 30 minutes :)

Check out here:

https://github.com/choldgraf/sandbox/blob/master/notebooks/tests/test_trisurf.ipynb

This is based off of one of the plotly examples, scaled up so that it plots more faces, and also so that it uses the offline mode. I also tried running this a few times on my computer and it seemed to slow down over time (though I don't have any hard numbers on that)

LMK if that works.

On Mon, Jun 13, 2016 at 3:19 PM, Chris Holdgraf choldgraf@gmail.com wrote:

gimme 5 minutes :)

On Mon, Jun 13, 2016 at 3:18 PM, Robert Monfera notifications@github.com wrote:

Perfect, works great e.g. in a github repo file!

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/plotly/plotly.py/issues/487#issuecomment-225725781, or mute the thread https://github.com/notifications/unsubscribe/ABwSHcaz_CyLnXkxgC0EhpDWxh5NJ3jcks5qLdcsgaJpZM4IpKnV .

monfera commented 8 years ago

Fantastic! Which Python interpreter version btw.?

monfera commented 8 years ago

Don't worry about it, I got it working with 2.7. The torus appeared but it took something like 5-6 seconds. Thanks again for this great setup!

choldgraf commented 8 years ago

yep - it's 2.7, I haven't tested it with 3+. For me the python code executes in ~5-6 seconds, and then it's another several seconds for the plot to actually show up (which I assume is stuff happening on the JS side).

ntfrgl commented 8 years ago

I would like to share this peculiar behaviour in Mesh3d plots. Maybe it is somehow helpful in looking for memory management issues, although the data payload is imported from a file.

The code is from #517 and it uses FigureFactory.create_trisurf() from the Python API (plotly 1.12.2) in offline mode within an ipython notebook. This leads to a regular oscillation in my RAM consumption when rendered inside Chrome, but not so for Firefox. The oscillation frequency seems to depend on the amount of interaction with the plot.

Chrome:

trisurf_chrome

Firefox:

trisurf_firefox

jackparmer commented 8 years ago

Hey @ntfrgl - This is definitely a plotly.js issue, I'm sorry you're running into it. @monfera is probably the most up-to-speed on what could be occurring here. Can you please reopen this issue with these (awesome!) screenshots in https://github.com/plotly/plotly.js?

monfera commented 8 years ago

@choldgraf we've identified a couple of possible reasons and work on PRs https://github.com/plotly/plotly.js/pull/724 and https://github.com/plotly/plotly.js/pull/726. As the work is ongoing, there may be other improvements, but these changes seem to solve a similar usability issue for the 2D scatterplot, which was the first target. Also, these kinds of improvements aim to fix the deterioration of performance during use, and it'll be a separate step to work to look into performance increases.

choldgraf commented 8 years ago

@monfera thanks for the update, you all are awesome. I wish I could be more constructive on the JS side :P

I think it's good to focus on the performance deterioration first, as that is more of a deal-breaker (as in, you can't use your browser if it's overloading memory). In terms of performance, I'm happy to chime in when you all are working on it after these PRs are resolved.

LMK if I can do anything.

choldgraf commented 8 years ago

also, +1 to using greek mythology in order to illustrate problems in PRs :D

monfera commented 8 years ago

@choldgraf fixes for the memory leaks had been merged and bundled by @etpinard a while ago, so this ticket could be evaluated for closing. We're also working on speedup of the initial render but that's a separate issue from this.

choldgraf commented 8 years ago

sweet, lemme give this a shot and see if problems pop up

choldgraf commented 8 years ago

Hmm, looks like a change in the trisurf code broke my brain plotting code. Out of curiosity why was show_colorbar added as a parameter, and not a keyword? Seems like it'd make more sense to just default to False rather than forcing it in the call to _trisurf.

monfera commented 8 years ago

@choldgraf I'm not yet familiar with the python bindings so maybe it could be made a separate ticket.

choldgraf commented 8 years ago

it'd be a pretty easy fix - basically you just add an =False flag to the parameter then it'll just default to False if nobody passes it explicitly in the call to the function, instead of throwing an error.

choldgraf commented 8 years ago

I can make a quick PR for it

monfera commented 8 years ago

@choldgraf I remember that in correspondence you shared observations about the slowness of the initial mesh render, but a quick search hasn't turned up an issue report from you.

May I ask that your observations regarding speed of initial rendering, possibly with an example and time measurement (and rough hardware strength) on an issue report? I remember you already submitted PRs for speeding up the python side of the initial rendering and we'd now focus on the JavaScript side, so could you please put this new issue on initial render speed under plotly.js?

monfera commented 8 years ago

@choldgraf thanks for the PR!

jonmmease commented 6 years ago

Closing as it looks like this discussion ran its course. Feel free to open a new issue if there are remaining loose ends to document.