fozziethebeat / S-Space

The S-Space repository, from the AIrhead-Research group
GNU General Public License v2.0
203 stars 106 forks

colon as file separator replaced by semicolon #56

Open lkrcmar opened 9 years ago

lkrcmar commented 9 years ago

I replaced the colon with a semicolon as the file separator wherever it seemed to be needed. Some errors disappeared.

davidjurgens commented 9 years ago

Hi Lubomír,

Thanks for this patch! Could you tell us more about what problem this is fixing? What errors were you seeing? Also, what platform were you running (Windows 8, Java 8, etc.) when you saw the errors? If there are other places where these errors might occur, I'd like to understand your problem so we can fix everything.

Thanks, David

lkrcmar commented 9 years ago

Hi David,

Thanks for the quick reply.

More about the problem: absolute paths on Windows (the version does not matter) start with a drive letter, such as C:/, D:/, or E:/. An absolute path to a stopword file could therefore be, for example, C:/dir/dir2/stopwords.txt. Because TokenFilter.java uses ":" to split a string into a list of file paths, this path is split into "C" and "/dir/dir2/stopwords.txt", neither of which is a valid Windows file path. Because of this ":" splitting, and because the tests use absolute file paths, all tests in TokenFilterTests.java fail with an IO error: java.io.IOError: java.io.FileNotFoundException: C (The system cannot find the file specified)... ...
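To make the failure concrete, here is a minimal sketch of the splitting behaviour described above. It is not the actual TokenFilter.java code; the class name PathSplitDemo and the method splitPaths are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of S-Space.
class PathSplitDemo {

    // Splits a list of file names on the given literal separator,
    // skipping empty entries.
    static List<String> splitPaths(String spec, String separator) {
        List<String> paths = new ArrayList<String>();
        for (String p : spec.split(Pattern.quote(separator))) {
            if (!p.isEmpty()) {
                paths.add(p);
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        // With ":" as the separator, each drive letter is cut off from its path.
        System.out.println(
            splitPaths("C:/dir/top-tokens.txt:C:/dir/test-words.txt", ":"));
        // prints [C, /dir/top-tokens.txt, C, /dir/test-words.txt]

        // With ";" as the separator, both absolute paths survive intact.
        System.out.println(
            splitPaths("C:/dir/top-tokens.txt;C:/dir/test-words.txt", ";"));
        // prints [C:/dir/top-tokens.txt, C:/dir/test-words.txt]
    }
}
```

The first call yields the bare "C" entry that later shows up in the FileNotFoundException message.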

In practice, this means I currently cannot use absolute file paths when specifying a token filter.

I can use: include=top-tokens.txt:test-words.txt,exclude=stop-words.txt

I cannot use: include=C:/dir/top-tokens.txt:C:/dir/test-words.txt,exclude=C:/dir/stop-words.txt
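As a rough illustration of how such a specification could be parsed with a configurable file separator, here is a sketch. It assumes the format is a comma-separated list of key=files entries, as in the examples above; the class FilterSpecDemo and the method parse are hypothetical names and do not reflect how TokenFilter.java is actually implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of S-Space.
class FilterSpecDemo {

    // Parses "key=file<sep>file,key=file" into a map from key to file list.
    // The separator between files is passed in by the caller.
    static Map<String, List<String>> parse(String spec, String fileSeparator) {
        Map<String, List<String>> result =
            new LinkedHashMap<String, List<String>>();
        for (String entry : spec.split(",")) {
            int eq = entry.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("missing '=' in: " + entry);
            }
            String key = entry.substring(0, eq);
            // Quote the separator so it is treated literally, not as a regex.
            String[] files =
                entry.substring(eq + 1).split(Pattern.quote(fileSeparator));
            result.put(key, new ArrayList<String>(Arrays.asList(files)));
        }
        return result;
    }

    public static void main(String[] args) {
        String spec = "include=C:/dir/top-tokens.txt;C:/dir/test-words.txt,"
                    + "exclude=C:/dir/stop-words.txt";
        // Splitting on ";" keeps the Windows drive letters attached.
        System.out.println(parse(spec, ";"));
    }
}
```

One design option, offered only as a suggestion, would be to pass java.io.File.pathSeparator as the separator: it is ";" on Windows and ":" on Unix-like systems, so existing colon-separated specifications would keep working on those platforms.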

(I am running Windows 8.1. I did not change the original pom.xml, which hopefully means I am using Java 1.6.)

Finally, I am sorry: I did not mean to open this pull request against the original S-Space repository (yet), only against my own fork of S-Space. Please consider the consequences of the change carefully.

You are welcome, Lubomír