commonsense / luminoso

A visualizer for multi-dimensional semantic data
http://csc.media.mit.edu/analogyspace/luminoso

Unicode problem #1

Closed: sgt101 closed this issue 13 years ago

sgt101 commented 14 years ago

Hello, I am back coding after a summer of doing admin.

I got the new build, ran a job, and found the following:

Warning: Sorry, an internal error occurred. Could you please send the authors the text below and a brief note about what you were doing? Thanks!

Traceback (most recent call last):
  File "/usr/local/lib/python2.6/dist-packages/Luminoso-1.2.1-py2.6.egg/luminoso/window.py", line 291, in analyze
    results = self.study_dir.analyze()
  File "/usr/local/lib/python2.6/dist-packages/Luminoso-1.2.1-py2.6.egg/luminoso/study.py", line 674, in analyze
    results.save(self.study_path('Results'))
  File "/usr/local/lib/python2.6/dist-packages/Luminoso-1.2.1-py2.6.egg/luminoso/study.py", line 518, in save
    self.write_core(tgt("core.txt"))
  File "/usr/local/lib/python2.6/dist-packages/Luminoso-1.2.1-py2.6.egg/luminoso/study.py", line 465, in write_core
    out.write(concept+', ')
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 2: ordinal not in range(128)

I have some noise in my data, but I think there is an alternative function that could write non-ASCII characters.

As a workaround, I found that I could use iconv to convert the files to UTF-8.
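
For readers without iconv at hand, the same kind of conversion can be sketched in Python. This is only an illustration: the thread does not say what encoding the input files were in, so the `latin-1` default below is an assumption, and `transcode_to_utf8` and the file names are hypothetical.

```python
import io
import os
import tempfile


def transcode_to_utf8(src_path, dst_path, src_encoding="latin-1"):
    """Re-encode a text file as UTF-8.

    src_encoding is an assumption; the thread does not state the
    original encoding of the data.
    """
    # newline="" keeps line endings exactly as they are in the source.
    with io.open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()
    with io.open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)


# Small demonstration with a throwaway file containing a Latin-1 "é".
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "input-latin1.txt")
dst_path = os.path.join(workdir, "input-utf8.txt")
with io.open(src_path, "wb") as f:
    f.write(b"caf\xe9\n")

transcode_to_utf8(src_path, dst_path)

with io.open(dst_path, "rb") as f:
    converted = f.read()
```

If the source encoding is guessed wrong, the output will still be valid UTF-8 but may contain the wrong characters, so it is worth spot-checking a few lines.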

kcarnold commented 14 years ago

I just pushed up some untested changes to make that function and one other write UTF-8-encoded output.
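
A minimal sketch of what writing UTF-8 output can look like, assuming the file is opened with an explicit encoding. This is not the actual patch, which is not shown in this thread: the `write_core` below is a hypothetical stand-in that only mirrors the `out.write(concept+', ')` line from the traceback. `io.open` is available from Python 2.6 onward and behaves the same way in Python 3.

```python
import io
import os
import tempfile


def write_core(path, concepts):
    """Hypothetical stand-in for the function named in the traceback."""
    # Opening with an explicit encoding means unicode strings are encoded
    # as UTF-8 on write instead of falling back to the ASCII codec.
    with io.open(path, "w", encoding="utf-8") as out:
        for concept in concepts:
            out.write(concept + u", ")


# Demonstration: a concept containing u'\xe9', the character from the error.
path = os.path.join(tempfile.mkdtemp(), "core.txt")
write_core(path, [u"caf\xe9", u"tea"])

with io.open(path, "rb") as f:
    raw = f.read()
```

The file on disk then holds the two-byte UTF-8 sequence for "é" rather than triggering a `UnicodeEncodeError`.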

It will be nice when we can switch to Python 3.x, which handles this stuff much better, but some of our dependencies are still not ported.

sgt101 commented 13 years ago

Thanks, Ken!

sgt101 commented 13 years ago

I'll finish my current project and then test the fix. The iconv cleanup let me proceed, and I need to get what I am working on delivered today!

By the way, I found that building the current version on a Mac or on Debian was hard. Setting up an Ubuntu image under VirtualBox and following your recommended installation steps was very straightforward in comparison. It might be worth putting a note and a couple of links on the site to let other people know that this is a reliable route to setting up Luminoso.

rspeer commented 13 years ago

I tried Ken's fix and discovered that one more fix was necessary in the case that an "interesting concept" was non-ASCII. I think the interesting concepts were getting double-decoded. I've pushed the fix.
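
For context on double-decoding: in Python 2, calling `.decode()` on a string that is already `unicode` first encodes it implicitly with the ASCII codec, which fails on any non-ASCII character. One common way to guard against that is to decode only values that are still bytes. The sketch below illustrates that idea; `to_text` is a hypothetical helper, not the fix that was pushed.

```python
def to_text(value, encoding="utf-8"):
    """Return text, decoding only if the value is still bytes.

    Hypothetical helper for illustration; safe to call more than once
    on the same value.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# UTF-8 bytes for a non-ASCII concept.
once = to_text(b"caf\xc3\xa9")
# A second pass leaves already-decoded text untouched.
twice = to_text(once)
```

Because the second call is a no-op, the same helper can be applied at several points in a pipeline without worrying about whether an earlier stage already decoded the value.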

sgt101 commented 13 years ago

Thanks for the fixes.