Closed bmatsuo closed 9 years ago
Hi Bryan,
I totally agree, the parser should accept STL files, which could be decoded unambiguously. The parser is strict only because I implemented it looking at the spec, and STL files I dealt with. It just happened that these files were fully conforming.
I will spend some time today on this issue, thanks for reporting!
Bryan,
please, take a look at the fix. Does it do what you requested?
Regarding to other places where the parser needs to be relaxed: I will try to run the parser of some number of STL files and will try to have all of them successfully parsed.
Certainly, I can't promise that I would find all the issues you have. Please, don't hesitate to file bugs. Given that I already have the context, it's OK to keep the explanation short, just a link to an STL file + the error message you see would be enough.
Thanks for the fast response, @krasin. And don't sweat it. I understand why it was strict. Any tone of frustration in my comment should be directed at the file format itself, and the other encoders with loose behavior, not your implementation. Compromises made for interoperability are lame :frowning:.
Anyway it looks like all my problem files parse successfully now. That great! But, knowing your stance on the matter now, I'll be sure to let you know if I see more problem with spec strictness in the future.
Thanks again. :smile:
Bryan,
yeah, the issue is on STL format side. :) Even such a basic thing as detecting ASCII vs Binary format requires heuristics.
I am currently running the parser against Thingiverse dump (published by archive.org, https://archive.org/details/archiveteam-thingiverse-2014-02). We'll see how it goes.
Unsurprisingly, batch run has found some issues: #3, #4. I am reopening this bug and will keep it open until the whole archive is parsed.
This is a really good idea! Although I would caution a little -- are you checking that problematic files are parsed successfully in other software? If other programs fail to read it then there's no reason to 'break' this decoder to do so. And loose decoders make the file format(edit: allow the format to) get out of hand in the first place. :frowning:
I load all my stl files into RepetierHost, which has sliced (most of) them with slic3r. Though, thinking now, rh may produce an intermediate stl file that is fed to slic3r.
So far, the number of problematic files is low (<0.5%) and the majority of the problems comes from spaces and tabs. These are safe to accept, since they don't make the input ambiguous.
If something more interesting pops up, I will file a bug and ask for your opinion. In general, I am also on the side of the parser being strict.
Sounds good.
Ok, based on 10% files from Thingiverse, there're the following issues: #7, #8, #9. Only one of them (#7) seems important enough to fix, but I would like to hear your opinion first.
Generally, the parser accepts 99.8% files from thingiverse. 0.1% are invalid STL files (html, dwg, skp renamed to stl), and the remaining 0.1% are covered by the issues above.
You're doing heroes' work, @krasin. Keep it up :smile:
50% run:
Total STL files: 112612
Success: 112090 / 99.54%
Failure: 521 / 0.46%
Crash: 1
Out of these failures:
Not an STL at all: 374 / 0.33%
Empty files: 42 / 0.04%
Fixable failures: 72 / 0.06%
Other legitimate failures (non-fixable, unless triangles are skipped): 33 / 0.03%
In other words, the parser is definitely right in 99.91%, and it's clear how to make it 99.97% right. After that, we'll need to skip triangles, and this kind of a behavior is currently hard to justify.
Detailed statistics for failures and the corresponding bugs:
The polysoup issue is probably another archive.org hickup. It's highly unclear, but there's nothing to parse locally.
@bmatsuo, at this point, all identified issues are either fixed or closed as "won't fix".
Given the amount of changes I made today, there's a chance of subtle failures which don't break the tests, but break the real function. I would appreciate if you gave the new version of the parser a try and tell if it works as intended.
Also, feel free to complain if some of the closed as "won't fix" issues seem to be important.
You are a badass, @krasin. Thanks for putting in so much work.
I'll pull the updates and test them out. Though I'm not doing anything special/significant. So I don't expect problems.
Thanks again for the effort and the package in general :smile:
I tried decoding several random things I downloaded from Thingiverse.com. Some produce errors. The implementation here appears to be fairly strict to the ASCII specification as defined here (which is the authoritative source AFAICT).
But, other encoders seem to take more liberties with the format. For example this file (not from Thingiverse but from Printrbot calibration) is missing the name following "endsolid". The error produced has the following message:
That's just one example. I won't list all errors I saw here.
The software I use for 3D printing (RepetierHost + Slic3r) are able to parse/print these files successfully. I would like this package to parse them as well.
Given the state of the specification I think it's better to have the default be more liberal with what it accepts. What do you think about that, @krasin?