krasin / stl

STL reader and writer in Go
MIT License
1 stars 2 forks source link

Reader is strict #1

Closed bmatsuo closed 9 years ago

bmatsuo commented 9 years ago

I tried decoding several random things I downloaded from Thingiverse.com. Some produce errors. The implementation here appears to be fairly strict to the ASCII specification as defined here (which is the authoritative source AFAICT).

But, other encoders seem to take more liberties with the format. For example this file (not from Thingiverse but from Printrbot calibration) is missing the name following "endsolid". The error produced has the following message:

Line expected to start with one of the prefixes: [facet normal  endsolid _40x10], the actual line is: 'endsolid'

That's just one example. I won't list all errors I saw here.

The software I use for 3D printing (RepetierHost + Slic3r) are able to parse/print these files successfully. I would like this package to parse them as well.

Given the state of the specification I think it's better to have the default be more liberal with what it accepts. What do you think about that, @krasin?

krasin commented 9 years ago

Hi Bryan,

I totally agree, the parser should accept STL files, which could be decoded unambiguously. The parser is strict only because I implemented it looking at the spec, and STL files I dealt with. It just happened that these files were fully conforming.

I will spend some time today on this issue, thanks for reporting!

krasin commented 9 years ago

Bryan,

please, take a look at the fix. Does it do what you requested?

Regarding to other places where the parser needs to be relaxed: I will try to run the parser of some number of STL files and will try to have all of them successfully parsed.

Certainly, I can't promise that I would find all the issues you have. Please, don't hesitate to file bugs. Given that I already have the context, it's OK to keep the explanation short, just a link to an STL file + the error message you see would be enough.

bmatsuo commented 9 years ago

Thanks for the fast response, @krasin. And don't sweat it. I understand why it was strict. Any tone of frustration in my comment should be directed at the file format itself, and the other encoders with loose behavior, not your implementation. Compromises made for interoperability are lame :frowning:.

Anyway it looks like all my problem files parse successfully now. That great! But, knowing your stance on the matter now, I'll be sure to let you know if I see more problem with spec strictness in the future.

Thanks again. :smile:

krasin commented 9 years ago

Bryan,

yeah, the issue is on STL format side. :) Even such a basic thing as detecting ASCII vs Binary format requires heuristics.

I am currently running the parser against Thingiverse dump (published by archive.org, https://archive.org/details/archiveteam-thingiverse-2014-02). We'll see how it goes.

krasin commented 9 years ago

Unsurprisingly, batch run has found some issues: #3, #4. I am reopening this bug and will keep it open until the whole archive is parsed.

bmatsuo commented 9 years ago

This is a really good idea! Although I would caution a little -- are you checking that problematic files are parsed successfully in other software? If other programs fail to read it then there's no reason to 'break' this decoder to do so. And loose decoders make the file format(edit: allow the format to) get out of hand in the first place. :frowning:

I load all my stl files into RepetierHost, which has sliced (most of) them with slic3r. Though, thinking now, rh may produce an intermediate stl file that is fed to slic3r.

krasin commented 9 years ago

So far, the number of problematic files is low (<0.5%) and the majority of the problems comes from spaces and tabs. These are safe to accept, since they don't make the input ambiguous.

If something more interesting pops up, I will file a bug and ask for your opinion. In general, I am also on the side of the parser being strict.

bmatsuo commented 9 years ago

Sounds good.

krasin commented 9 years ago

Ok, based on 10% files from Thingiverse, there're the following issues: #7, #8, #9. Only one of them (#7) seems important enough to fix, but I would like to hear your opinion first.

Generally, the parser accepts 99.8% files from thingiverse. 0.1% are invalid STL files (html, dwg, skp renamed to stl), and the remaining 0.1% are covered by the issues above.

bmatsuo commented 9 years ago

You're doing heroes' work, @krasin. Keep it up :smile:

krasin commented 9 years ago
50% run:

Total STL files: 112612
Success: 112090 / 99.54%
Failure: 521 / 0.46%
Crash: 1

Out of these failures:

Not an STL at all: 374 / 0.33%
Empty files: 42 / 0.04%
Fixable failures: 72 / 0.06%
Other legitimate failures (non-fixable, unless triangles are skipped): 33 / 0.03%

In other words, the parser is definitely right in 99.91%, and it's clear how to make it 99.97% right. After that, we'll need to skip triangles, and this kind of a behavior is currently hard to justify.

Detailed statistics for failures and the corresponding bugs:

The polysoup issue is probably another archive.org hickup. It's highly unclear, but there's nothing to parse locally.

krasin commented 9 years ago

@bmatsuo, at this point, all identified issues are either fixed or closed as "won't fix".

Given the amount of changes I made today, there's a chance of subtle failures which don't break the tests, but break the real function. I would appreciate if you gave the new version of the parser a try and tell if it works as intended.

Also, feel free to complain if some of the closed as "won't fix" issues seem to be important.

bmatsuo commented 9 years ago

You are a badass, @krasin. Thanks for putting in so much work.

I'll pull the updates and test them out. Though I'm not doing anything special/significant. So I don't expect problems.

Thanks again for the effort and the package in general :smile: