lvdmaaten / bhtsne

Barnes-Hut t-SNE

run_bh_tsne fails with FileNotFoundError in Python 3.5 #55

Open rjurney opened 7 years ago

rjurney commented 7 years ago

This works in Python 2.7 but fails in Anaconda Python 3.5 on both OS X and Linux. Something is wrong with the file handling. I can't figure out what, but the result.dat file named in the traceback below does not exist.

It looks like the wrapper opens the file in read mode ('rb') and then writes to it? I'm not familiar with that pattern.

Traceback (most recent call last):
  File "test/test_tsne.py", line 24, in reduce_dimensions
    result = bhtsne.run_bh_tsne(pca_result)
  File "/Users/rjurney/Software/pinpointcloud_worker/bhtsne/bhtsne.py", line 214, in run_bh_tsne
    for result in bh_tsne(tmp_dir_path, verbose):
  File "/Users/rjurney/Software/pinpointcloud_worker/bhtsne/bhtsne.py", line 159, in bh_tsne
    with open(path_join(workdir, 'result.dat'), 'rb') as output_file:
FileNotFoundError: [Errno 2] No such file or directory: '/var/folders/0b/74l_65015_5fcbmbdz1w2xl40000gn/T/tmph9x08ku8/result.dat'

Coolnesss commented 7 years ago

+1

Getting the same error here, on Mac OS Python 2.7.12. @rjurney you mentioned it works on 2.7, did you get it to just work out of the box?

EMCP commented 7 years ago

I have noticed this can occur when the job has failed and the resulting file is empty. If you can, go to that location and double-check what's there.
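A minimal sketch of that check, assuming the wrapper's temporary directory is still on disk when you look (it may already have been cleaned up). The helper name inspect_result is hypothetical and not part of bhtsne.py; only the file name result.dat comes from the traceback above.

```python
import os


def inspect_result(workdir, filename="result.dat"):
    """Report the state of the output file inside workdir.

    Hypothetical helper for debugging, not part of bhtsne.py.
    Returns "missing", "empty", or "ok".
    """
    path = os.path.join(workdir, filename)
    if not os.path.exists(path):
        # The binary never wrote its output (for example, it crashed early).
        return "missing"
    if os.path.getsize(path) == 0:
        # The file was created but nothing was written to it.
        return "empty"
    return "ok"
```

Anything other than "ok" suggests the problem lies in the step that was supposed to produce the file, not in the open() call that shows up in the traceback.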

For me this happens after what looks like a crash at https://github.com/lvdmaaten/bhtsne/blob/master/bhtsne.py#L110

Stepping through line by line in a debugger, that is the line where things abruptly crash on OS X.

Update: a crash in numpy is the culprit behind this error for me.

https://github.com/numpy/numpy/issues/9254

EMCP commented 7 years ago

Just to add to my previous post: the same problem occurs for me in Python 3.x, and the solution is, again, to recompile numpy with OpenBLAS on Mac platforms. Hope that helps.

EMCP commented 7 years ago

I cannot get this to run at all inside Jupyter Notebook (probably because of the underlying numpy issue(s)), so I've avoided running this wrapper in anything but plain .py files.

tdekeyser commented 6 years ago

Line 210 in the Python wrapper seems to suggest that running the code via a Jupyter notebook is not recommended: bhtsne.py (210): print("Please run this program directly from python and not from ipython or jupyter.")

As @EMCP already mentioned, the issue with numpy may be the reason for the crash. This thread also suggests there is an issue with calling the fork() function on line 199 of the Python wrapper. A questionable workaround on OS X (which I have tested) is to simply run the code without using a separate process to initiate the bhtsne calculations. With the statements related to fork() commented out, the wrapper uses the bhtsne binary correctly. However, keep in mind that the fork is there explicitly because of memory concerns, so this workaround may not work if the data is very large.
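A rough sketch in the spirit of that workaround, for Python 3.5 or later: run the external command with subprocess.run instead of fork(), check its exit status, and only then open the output. The function name run_and_read, its parameters, and the error messages are assumptions for illustration, not the wrapper's actual code. It also assumes the command writes result.dat into its working directory, as the traceback suggests.

```python
import os
import subprocess


def run_and_read(cmd, workdir, output_name="result.dat"):
    """Run cmd in workdir (no fork() in our own code) and return the output bytes.

    Hypothetical helper, not part of bhtsne.py.
    """
    # subprocess.run waits for the child to finish and records its exit status.
    proc = subprocess.run(cmd, cwd=workdir)
    if proc.returncode != 0:
        # Fail here, with the real cause, rather than later in open().
        raise RuntimeError("command exited with status %d" % proc.returncode)

    path = os.path.join(workdir, output_name)
    if not os.path.isfile(path):
        raise RuntimeError("command finished but wrote no %s" % output_name)

    with open(path, "rb") as output_file:
        return output_file.read()
```

On POSIX systems a negative exit status means the child was killed by a signal, which is what a crash would look like. Checking it up front turns the confusing FileNotFoundError into an error that points at the failing step.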

lvdmaaten commented 6 years ago

You may want to try https://github.com/lvdmaaten/bhtsne/pull/69 which removes the need for writing data to files and forking a new process in the first place.

I'm planning to add an FFI-based solution for Python soon that will do the same without the Cython dependencies.

duangenquan commented 6 years ago

I added a Python wrapper based on Boost, which runs smoothly and visualizes the results at the end. There is no need to recompile BLAS, numpy, etc. Hope this helps!