j0k3r / graby

Graby helps you extract article content from web pages
MIT License
365 stars 74 forks

ContentExtractor: clear image property on reset #307

Closed: jtojnar closed this 1 year ago

jtojnar commented 1 year ago

The property was introduced in https://github.com/j0k3r/graby/commit/b5d8ad48c7d01505c53c80382ccf11f2acb90148, but the reset method was never updated to clear it. As a result, Graby could return an image from a previously processed item when the Graby instance is reused (as Wallabag does). Ideally, we would create a fresh ContentExtractor for each URL to preempt issues like this.
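To illustrate the failure mode, here is a minimal sketch in Python rather than Graby's PHP. The `Extractor` class, its `title` and `image` fields, and the `clear_image` switch are all hypothetical stand-ins, not Graby's actual API. It only shows how a reused object leaks a value when its reset step skips one property.

```python
from typing import Optional


class Extractor:
    """Hypothetical stand-in for a reusable content extractor (not Graby's API)."""

    def __init__(self) -> None:
        self.title: Optional[str] = None
        self.image: Optional[str] = None

    def reset(self, clear_image: bool = True) -> None:
        # clear_image=False mimics a reset method that was never
        # taught about the newer `image` property.
        self.title = None
        if clear_image:
            self.image = None

    def process(self, page: dict) -> None:
        # Only fields present on the page are overwritten, so anything
        # left over from a previous item survives unless reset clears it.
        if "title" in page:
            self.title = page["title"]
        if "image" in page:
            self.image = page["image"]


def extract_all(pages: list, clear_image: bool) -> list:
    # A single extractor instance is reused for every page.
    extractor = Extractor()
    results = []
    for page in pages:
        extractor.reset(clear_image=clear_image)
        extractor.process(page)
        results.append((extractor.title, extractor.image))
    return results


# The second page has no image of its own.
pages = [{"title": "First", "image": "first.png"}, {"title": "Second"}]

buggy = extract_all(pages, clear_image=False)  # second item inherits first.png
fixed = extract_all(pages, clear_image=True)   # second item has no image
```

Constructing a new extractor inside the loop would avoid the problem without relying on `reset` being kept in sync with every property, which is the alternative suggested above.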

jtojnar commented 1 year ago

The CI failure is unrelated; opened https://github.com/j0k3r/graby/pull/308

jtojnar commented 1 year ago

We should also cherry-pick this to 2.x so that Wallabag can use it.