Open joepio opened 1 year ago
@jrmuizel Could you take a look at my PR? Let me know if anything requires changing. Also, if you'd like help with maintenance of this repo, let me know.
I'll try to take a look this week
@jrmuizel I've made some additional changes this weekend. Let me know if there's anything I can do to help get this merged.
Thanks for your contribution. I finally had a chance to look at this and there's a lot here. I don't have a lot of time for this project so it would be easier for me if you broke it up into individual PRs that I can review and merge incrementally instead of one big one.
A couple of things:
I'm not a huge fan of the mass unwrap removal. Without a fix for https://github.com/rust-lang/rust/issues/54144 it's a lot more painful to debug a propagated error instead of a panic. I wouldn't really recommend running this software in production but if you really want to perhaps you can use catch_unwind
. If you pdfs that trigger particular panics I'm happy to fix those on a case by case basis.
Regarding testing, I don't really want to bloat the repo with large test PDFs. Perhaps a good option would be to do what pdf.js does and use urls in pdf.link files (https://github.com/mozilla/pdf.js/tree/master/test/pdfs).
I'd also prefer the changes in the clippy commit be broken out into individual fixes.
Sorry if these requests come across as a bit particular. If you feel differently about them or don't want to spend more time on these changes I totally understand you keeping your own fork. In the mean time I'll continue going through your changes trying to pull out the parts that look good to me.
@jrmuizel Thanks for the thorough response! And good to see you've already merged the readme change.
Without a fix for https://github.com/rust-lang/rust/issues/54144 it's a lot more painful to debug a propagated error instead of a panic
Not sure if I understand this correctly, but I think there are some other approaches that could give you more context while debugging. For example, the anyhow
crate provides stack traces within normal Results, using RUST_BACKTRACE=1
and the backtrace
feature. Or you could create a custom error, and set a breakpoint in the code related to instantiating that custom error. I really think keeping the .unwrap
is not the right call for a published library.
I'd also prefer the changes in the clippy commit be broken out into individual fixes.
Well, that's the first commit. You could cherry pick that one perhaps?
Regarding testing, I don't really want to bloat the repo with large test PDFs. Perhaps a good option would be to do what pdf.js does and use urls in pdf.link files (https://github.com/mozilla/pdf.js/tree/master/test/pdfs).
If we re-fetch all pdf files every test run, testing becomes thousands of times slower, which harms productivity. However, we could store them in a local directory that is in .gitignore
.
If you feel differently about them or don't want to spend more time on these changes I totally understand you keeping your own fork
Well, it all boils down to the error handling. If you really want to keep the panics, I think I'll work on the fork.
Without a fix for https://github.com/rust-lang/rust/issues/54144
I'd recommend using something like thiserror, which has support for stabilized Backtraces, while allowing for specific error kinds. So you can just include a backtrace for every error. Otherwise you may use something like the SNAFU crate, which allows backtrace under specific flags, so you can restrict them to debug builds.
I wouldn't really recommend running this software in production but if you really want to perhaps you can use catch_unwind
I strongly recommend against using catch_unwind as some kind of try-catch method for general exceptions. Panics are supposed to be unfixable situations, a PDF parser panicking my complete application by reading some sort of weird PDF is definitely an unhappy surprise (and a major DoS attack path). Even for hobby projects I personally avoid any libraries that do this.
You're definitely free to do otherwise, just voicing my 2 cents.
this is such a stand thing to do in a library these days. panics really make this unusable. anyhow + thiserro + ? would be great.
Any word on this? I would love to have proper error propagation. Currently having to do catch_unwind
and suppressing println
statements for Unicode mismatch
is not very idiomatic Rust I'd say.
@joepio @HarrisDePerceptron @0xpr03 couldn't agree more on the need for stable libraries and on the use of anyhow + thiserror. As others have stated I'm also hesitant to incorporate this lib into production code if it will end up crashing somewhere down the line.
Is there any update @jrmuizel on whether this can be integrated soon?
Hi there! This PR turned out a little larger than I anticipated, but I think it bings some important improvements:
panic
s from the code (includingunwrap
,expect
,assert
), and introduces returningErr
s instead. See #47.Res
custom result type/docs
folder with example pdf files that should parse, closes #36smask
errors to be more lenient #24 #27lopdf
dependencylopdf
closes #44println
, replace withdlog
closes #52 closes #45This PR also makes some other open PRs unnecessary: #25 #45 #44, also has overlap with #49
It still requires some work though. @jrmuizel could you help out to fix the last of the types?Some open questions / to still be improved upon:
panic
s! This PR doesn't fix them all. (maybe I'll come back later to do that though)from_opt_obj
has many panics still, in particularlib.rs
into a bunch of modules could help make the project a bit more easy to navigate.Type0Func
,Type2Func
,CalGray
,CalRGB
. The latter ones are public - not sure if these are used anywhere outside? Their fields are unused in the repo.