jupyter / help


OS X - 2GB+ datasets in Jupyter crash all open applications #104

Open PedroAlucinante opened 7 years ago

PedroAlucinante commented 7 years ago

Loading a large dataset (a pandas DataFrame constructed from a ~2GB CSV file) into the Jupyter Notebook hangs every open application and locks out the user interface of the entire system within two minutes. I am able to use the keyboard to open the "Force Quit" dialog, at which point killing the notebook server and all open browser sessions allows me to regain control of the OS, but I still have to manually kill and restart all other running processes. It doesn't seem to be random; reading a dataset of this size (I have several) into the notebook has caused the issue 12 out of 12 times.

In case it is of interest, here are the other applications I was running during one or more of the crashes:

- Chrome (used this for the Jupyter browser session, non-Jupyter tabs also hang)
- Terminal (ran server from here, but other open terminal windows also hang)
- Firefox
- Finder
- VLC Media Player
- TextEdit
- Skype
- Activity Monitor
- Slack
- Eclipse
- PyCharm
- Microsoft Remote Desktop

I am using OS X El Capitan 10.11.2 on a mid-2014 MacBook Pro with 16 GB RAM.

takluyver commented 7 years ago

Is this specific to Jupyter, or does the same happen if you load it in pandas in a Python terminal shell? I'd guess the data is larger in memory and it's filling up most of your RAM.
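
One way to check the guess that the data takes more room in memory than on disk is to compare the CSV's size with what pandas reports for the loaded DataFrame. This is a minimal sketch, not the reporter's actual code: the tiny in-memory CSV and its column names are made up for illustration, and the real ~2GB file would be passed to `pd.read_csv` by path instead.

```python
import io

import pandas as pd

# Small stand-in for the real file; the columns are hypothetical.
csv_text = "id,name,value\n" + "".join(
    f"{i},item{i},{i * 0.5}\n" for i in range(1000)
)
csv_bytes = len(csv_text.encode("utf-8"))

df = pd.read_csv(io.StringIO(csv_text))

# deep=True asks pandas to also measure the contents of string/object
# columns, not just the references to them.
in_memory_bytes = int(df.memory_usage(deep=True).sum())

print(f"CSV size:       {csv_bytes} bytes")
print(f"DataFrame size: {in_memory_bytes} bytes")
```

Running the same comparison on a small slice of the real file (for example with the `nrows` argument of `pd.read_csv`) would give a rough per-row estimate before attempting the full load.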

PedroAlucinante commented 7 years ago

Specific to Jupyter. Python terminal produces OutOfMemory error if I open a large enough dataset, but the problem is confined to that Python terminal. In Jupyter it hangs unrelated applications.

takluyver commented 7 years ago

Is it possible that it's trying to display it after loading it? Jupyter just tells Python to run your code, so if it's doing the same thing, there's no obvious reason why it would behave differently.

PedroAlucinante commented 7 years ago

Originally I was doing some math on it, but when I started trying to diagnose the issue I pared it down to just loading the .csv and got the same behavior.

takluyver commented 7 years ago

Can you execute the code and watch it in the monitor application to see how much memory it uses? If it is just loading, I'd still guess that it's taking over a load of memory and that's what is slowing things down. I don't know why it would be different to terminal Python, though.
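
If the monitor application stops updating, the process can also report its own peak memory from inside the notebook or the terminal. The sketch below uses only the standard library and assumes a Unix-like system (the `resource` module is not available on Windows). The `peak_rss_mb` helper is a hypothetical name, and the `bytearray` allocation is only a stand-in for the real load.

```python
import resource
import sys


def peak_rss_mb():
    """Return this process's peak resident set size in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


before = peak_rss_mb()
# Stand-in for the real work, e.g. loading the CSV with pandas.
data = bytearray(50 * 1024 * 1024)
after = peak_rss_mb()

print(f"peak RSS before: {before:.0f} MiB, after: {after:.0f} MiB")
```

Printing this before and after the load in both Jupyter and a plain Python shell would show whether the two environments really reach different memory levels.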

PedroAlucinante commented 7 years ago

Tried this. It reached ~3GB after a few seconds, and then the monitor window stopped updating.

This may be OS X specific; I tried on Ubuntu last night and the notebook prints the OutOfMemory error the same way as the Python terminal.