html-extract / hext

Domain-specific language for extracting structured data from HTML documents
https://hext.thomastrapp.com
Apache License 2.0

Consider switching gumbo-parser upstream to codeberg.org/gumbo-parser/gumbo-parser #31

Open thomastrapp opened 7 months ago

thomastrapp commented 7 months ago

https://github.com/google/gumbo-parser is no longer maintained. Arch has already picked up https://codeberg.org/gumbo-parser/gumbo-parser as the new upstream source.

brandonrobertz commented 7 months ago

Do you think we need to do any auditing on the new dependency? I'm willing to help out. I can test on Windows and macOS (Intel), as well as Linux of course.

thomastrapp commented 7 months ago

> Do you think we need to do any auditing on the new dependency?

I have to do my due diligence and verify that their changes to gumbo-parser make sense to me before adding the new dependency to the binary releases of hext.

If you want to, you can take a look at codeberg.org/gumbo-parser/gumbo-parser, and if you have any objections, let me know.

Thank you for sticking around, @brandonrobertz.

brandonrobertz commented 7 months ago

> If you want to, you can take a look at codeberg.org/gumbo-parser/gumbo-parser, and if you have any objections, let me know.

I went through the changeset and the changes look very reasonable so far: bugfixes, cleanup, and slight modernization. Not a whole lot of code, honestly.