tarbaevbb / wnd-charm

Automatically exported from code.google.com/p/wnd-charm

wndchrm should fail gracefully if it encounters a file with unsupported extension #4

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago

What steps will reproduce the problem?
1. Take any .fit file.
2. Rename it to have a different extension (I used .foo).
3. Run the command:
wndchrm test -f1.0 training_set.foo

What is the expected output? What do you see instead?
Expected: wndchrm fails gracefully when given a file with an unsupported extension.
Instead, wndchrm tries to parse the .foo file as if it were a list of files.

Original issue reported on code.google.com by christop...@nih.gov on 17 Feb 2011 at 11:51

GoogleCodeExporter commented 9 years ago
Unfortunately, that means we need a way to differentiate .fit files from 
file-of-files files, because they appear in the same position on the command line.

In .fit files, the first three lines are "pure_numeric", indicating the number of 
classes, number of features, and number of samples.  So a method like 
TrainingSet::isFitFile could open the file, read the first three lines, check 
that they're pure numeric, and return true or false.  If false, LoadFromPath 
would fall through to interpret it as a file-of-files. If true, it would call 
ReadFromFile, returning an error if it's a malformed .fit.
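
A rough sketch of what such a check might look like, written as a free function rather than the actual TrainingSet method. The helper isPureNumeric and its definition of "pure numeric" (digits only, surrounding whitespace ignored) are assumptions made for illustration, not the wndchrm code:

```cpp
#include <cctype>
#include <fstream>
#include <string>

// Hypothetical helper: true if the line, ignoring surrounding whitespace,
// is non-empty and consists only of decimal digits. This is an assumed
// reading of "pure numeric"; the real check may differ.
static bool isPureNumeric(const std::string& line) {
    const char* ws = " \t\r\n";
    const std::size_t first = line.find_first_not_of(ws);
    if (first == std::string::npos) return false;  // blank line
    const std::size_t last = line.find_last_not_of(ws);
    for (std::size_t i = first; i <= last; ++i) {
        if (!std::isdigit(static_cast<unsigned char>(line[i]))) return false;
    }
    return true;
}

// Sketch of the proposed check: the first three lines (class count,
// feature count, sample count) must all be pure numeric.
bool isFitFile(const std::string& path) {
    std::ifstream in(path);
    if (!in) return false;  // unreadable file: not treated as a .fit file
    std::string line;
    for (int i = 0; i < 3; ++i) {
        if (!std::getline(in, line) || !isPureNumeric(line)) return false;
    }
    return true;
}
```

With this shape, a caller such as LoadFromPath would only need the boolean result to choose between the .fit reader and the file-of-files path.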

Original comment by i...@cathilya.org on 18 Feb 2011 at 2:37

GoogleCodeExporter commented 9 years ago
A check for isFileOfFiles is not as robust.  There is no requirement to have a 
<TAB><label> on the line, so the only check to make is "do we have a string on 
each line"?  Which is pretty silly.  It does check if a supported image 
extension is present in the leading string once it falls through from .fit to 
file-of-files though (at least I think so, or it should).

The degenerate case is if you've named your files numerically, without an image 
file extension and without labels.  Then your file-of-files will look like a 
.fit file to isFitFile().  Of course, extensions on image filenames are 
enforced, so this kind of file-of-files wouldn't work anyway.  But it would 
still be interpreted initially as a .fit file by isFitFile() if we implement it 
that way.
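
To make the per-line extension check concrete, here is a minimal sketch. The function name hasSupportedImageExtension and the caller-supplied extension list are hypothetical and do not reflect wndchrm's actual list of supported formats:

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: checks one file-of-files line of the form
// "<path>" or "<path><TAB><label>". The label is optional, so only the
// extension of the leading path is tested. The extension list is supplied
// by the caller and is an assumption, not wndchrm's actual list.
bool hasSupportedImageExtension(const std::string& line,
                                const std::vector<std::string>& extensions) {
    // Keep only the text before the first TAB, if any.
    std::string path = line.substr(0, line.find('\t'));
    const std::size_t dot = path.rfind('.');
    if (dot == std::string::npos) return false;  // no extension at all
    std::string ext = path.substr(dot);
    // Compare case-insensitively by lowercasing the extension.
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return std::find(extensions.begin(), extensions.end(), ext) != extensions.end();
}
```

A line such as "123" from the degenerate case has no extension, so a check like this would reject it once the file is read as a file-of-files, even though its first three lines would pass a pure-numeric test.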

Original comment by i...@cathilya.org on 18 Feb 2011 at 3:37

GoogleCodeExporter commented 9 years ago
Fixed by implementing IsFitFile() and using it instead of checking for a .fit 
file extension.
IsFitFile() returns true if the first three lines of the file are pure numeric.

Original comment by i...@cathilya.org on 19 Feb 2011 at 7:14