fozziethebeat / S-Space

The S-Space repository, from the AIrhead-Research group
GNU General Public License v2.0

File text is not readable? #67

Closed hbthien closed 9 years ago

hbthien commented 9 years ago

Dear all, I tested the output formats "TEXT" and "SPARSE_TEXT", but both produce files that are not readable. They look the same as the binary files. I tried to read them using the code in SemanticSpaceIO, but it did not work. Could you please tell me how to view the contents of these files? Thanks a lot.

davidjurgens commented 9 years ago

Hi,

Both formats should produce human-readable files. Could you please give an example of how you're trying to create these files and what errors you see if you try to read them?

Thanks, David

hbthien commented 9 years ago

Hi David, I tried both text formats, "--outputFormat=TEXT" and "--outputFormat=SPARSE_TEXT", running LSAMain.java in Eclipse with these arguments:

-d corpus.txt -F --outputFormat=TEXT exclude=stopwords.txt my-lsa-output-no-stopwords.sspace

The resulting ".sspace" file is not readable, and I don't know why. Please tell me how to fix this.

In addition, "corpus.txt" contains punctuation (such as "." and ","), and when I debug the program I can see that it is not removed. I looked for a function that removes it but could not find one.
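For the punctuation question, one option is to clean the corpus before passing it to LSAMain. The following is a minimal standalone sketch in plain Java; it does not use any S-Space API, and the class name `StripPunctuation` and the method `clean` are hypothetical names chosen for illustration. It replaces ASCII punctuation with spaces and keeps one output line per input line.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Pattern;

// Hypothetical helper, not part of S-Space: pre-cleans a corpus file.
public class StripPunctuation {

    // \p{Punct} matches ASCII punctuation such as . , ; : ! ? ' "
    private static final Pattern PUNCT = Pattern.compile("\\p{Punct}+");
    private static final Pattern SPACES = Pattern.compile("\\s+");

    /** Replaces punctuation with spaces, then collapses repeated whitespace. */
    static String clean(String line) {
        String noPunct = PUNCT.matcher(line).replaceAll(" ");
        return SPACES.matcher(noPunct).replaceAll(" ").trim();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java StripPunctuation <input> <output>");
            return;
        }
        // Assumes the corpus is UTF-8 encoded.
        try (BufferedReader in = Files.newBufferedReader(
                 Paths.get(args[0]), StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(
                 Paths.get(args[1]), StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.write(clean(line));
                out.newLine();
            }
        }
    }
}
```

Because punctuation is replaced with a space rather than deleted, a word like "It's" becomes the two tokens "It" and "s"; whether that is acceptable depends on the corpus.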

davidjurgens commented 9 years ago

Your options look correct, so the file should be in a correct format. What error are you seeing when you try to read it? Also, what program are you using to view it?
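One way to answer these questions independently of any particular viewer is to look at the first raw bytes of the output file. The sketch below is plain Java with no S-Space dependency; the class name `FilePeek` and its methods are hypothetical. It prints the first bytes in hex and counts NUL bytes, since some editors treat a file containing NUL bytes as binary even when the rest is text. It makes no assumption about what the .sspace file actually contains.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical diagnostic helper, not part of S-Space.
public class FilePeek {

    /** Counts NUL (0x00) bytes among the first len bytes of buf. */
    static int countNulBytes(byte[] buf, int len) {
        int n = 0;
        for (int i = 0; i < len; i++) {
            if (buf[i] == 0) {
                n++;
            }
        }
        return n;
    }

    /** Formats the first len bytes of buf as space-separated two-digit hex. */
    static String toHex(byte[] buf, int len) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < len; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", buf[i] & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java FilePeek <file>");
            return;
        }
        byte[] buf = new byte[64];
        int len;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            // A single read may return fewer than 64 bytes, or -1 for an empty file.
            len = in.read(buf);
        }
        if (len < 0) {
            len = 0;
        }
        System.out.println("first " + len + " bytes: " + toHex(buf, len));
        System.out.println("NUL bytes among them: " + countNulBytes(buf, len));
    }
}
```

Running `java FilePeek my-lsa-output-no-stopwords.sspace` on both Ubuntu and Windows and comparing the output would show whether the file itself differs or only the viewer's handling of it.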

hbthien commented 9 years ago

Hi, I don't know why, but the file cannot be read on Ubuntu, while it opens fine on Windows. Thank you.