gjtorikian / html-pipeline

HTML processing filters and utilities
MIT License
2.27k stars 382 forks source link

V3 Release #347

Closed gjtorikian closed 1 year ago

gjtorikian commented 3 years ago

Mostly, this will modernize the gem by dropping old Ruby and Rails implementations. I'm also planning to user kwargs across the filters and methods.

If you're reading this and looking forward to adding a new breaking change, please open an issue to discuss.

Merged PRs for a future changelog:

digitalmoksha commented 2 years ago

@gjtorikian are you still thinking of moving forward with this branch? As we're going to be moving to Ruby 3, I'd be interested in seeing a release with the migration to GitHub actions and Ruby 3.1 in the testing matrix, on the current code base. I wonder if you've thought about splitting that out.

gjtorikian commented 2 years ago

Yes, I am! Within the next month or so, for sure. I’ve had quite a few irons in the fire but this is still a priority for me.

gjtorikian commented 1 year ago

@digitalmoksha Just popping in to say I haven't forgotten about this. Sorry. I expect to release a version at the end of November / early December, as I am rewriting the backend in Rust, for performance reasons.

digitalmoksha commented 1 year ago

@gjtorikian

Just popping in to say I haven't forgotten about this

Oh no worries! And I spaced on replying to this.

as I am rewriting the backend in Rust, for performance reasons.

Are you saying you're rewriting this repo in Rust, and dropping Ruby? I know commonmarker is going to be using Rust backend, but this gem as well?

Drop CamoFilter

Just noticed this above. Just as an fyi, we do actually use the CamoFilter, https://gitlab.com/gitlab-org/gitlab/blob/master/lib/banzai/filter/asset_proxy_filter.rb. If that does get removed, I guess we'll inline it.

gjtorikian commented 1 year ago

Are you saying you're rewriting this repo in Rust, and dropping Ruby? I know commonmarker is going to be using Rust backend, but this gem as well?

Yes, see #368 and the underlying repo for more info.

Just noticed this above. Just as an fyi, we do actually use the CamoFilter, https://gitlab.com/gitlab-org/gitlab/blob/master/lib/banzai/filter/asset_proxy_filter.rb. If that does get removed, I guess we'll inline it.

Can you upstream this to here? AssetProxyFilter looks much nicer.

digitalmoksha commented 1 year ago

Can you upstream this to here? AssetProxyFilter looks much nicer.

Yeah you bet!

gjtorikian commented 1 year ago

Can you upstream this to here? AssetProxyFilter looks much nicer.

Yeah you bet!

Hey, any interest in upstreaming?

digitalmoksha commented 1 year ago

Yeah of course, sorry I spaced on this. Let me spin something up...

Edit: spun up https://github.com/gjtorikian/html-pipeline/pull/379

digitalmoksha commented 1 year ago

@gjtorikian I'm curious, have you considered creating a Rust version of html-pipeline? I'm thinking about moving some of our filters into a Rust based pipeline for speed 🤔

gjtorikian commented 1 year ago

I haven't thought about it—but walk me through which part you would want/expect to Rustify ? All I do these days is wrap Rust in Ruby. 😆

V3 moved the bulk of the HTML selection-manipulation logic to using selma, which uses lol-html under the hood. html-pipeline is now just a dumb interface which takes an array of Ruby classes and iterates over their selectors; but the real manipulation work is done in Rust, and the Ruby class just acts as a sort of sugar on top. If anything, I'd like to move towards turning this into an async library (which is partly why I haven't solidified v3 yet--haven't had time to explore what this would entail).

digitalmoksha commented 1 year ago

I think it would be basically the same as what you have now, just in Rust - given a vector of filters, run through them calling their transformation function, passing results from one to the other.

Still coming up to speed on Rust's object oriented features, but it seems like using traits might give you the ability to not only gain default functionality of say a TextFilter, but allow you to override that functionality as well.

So if we have our own Rust crate which we can move some of our current Ruby filters/pipeline into, then I think we would see some performance win.

Haven't yet attempted to migrate to V3 of html-pipeline, so maybe that would be enough of a win. But I'm still considering having some type of pipeline ability in our Rust crate.

So I thought you might have something in the works ;-)