tuttiq closed this 1 month ago
@knowtheory @jashkenas Any chance we get this merged?
Hi, I just ran into the same issue when trying to upgrade to Ruby 3.2. It would be great if this could get merged. It should also not break compatibility with Ruby versions < 3.2.
@tuttiq any news on this? I just encountered this issue. Did you end up using an alternative to this gem, and if so, could you tell me which one?
@tsotne-m (cc @tmaier) I ended up pointing the source for the gem (in my project's Gemfile) to my forked version: https://github.com/tuttiq/docsplit
Not great, but I figured this repository is no longer being maintained 🤷‍♀️ I don't plan on maintaining my fork either (since I'm not working on that project anymore), so I recommend you maintain your own forks if you need this gem long term.
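For anyone wanting to do the same, a minimal Gemfile sketch follows. It assumes Bundler's `git:` option and whatever the fork's default branch is; swap in the URL of your own fork if you maintain one.

```ruby
# Gemfile
# Pull docsplit from a fork instead of rubygems.org.
# The URL is the fork mentioned above; replace it with your own fork if you have one.
gem "docsplit", git: "https://github.com/tuttiq/docsplit"
```

Run `bundle install` afterwards so Gemfile.lock records the git source.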
@tuttiq Thanks a lot for the response.
I'm working on getting rid of Docsplit as well. It depends on your use case, but in my case (extracting text from word processing documents) the way to go seems to be using something like libreconv (or just LibreOffice directly) to convert the document to PDF, and then pdf-reader to extract the text.
I'm considering using Apache Tika in the future, specifically as a Tika microservice.
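A rough sketch of the "LibreOffice directly, then pdf-reader" route, in case it helps. It assumes the `soffice` binary is on your PATH and the pdf-reader gem is installed; the helper names are made up for illustration and are not part of either tool.

```ruby
# Command line for a headless LibreOffice conversion to PDF.
# Assumption: the LibreOffice binary is available as `soffice` on PATH.
def libreoffice_pdf_command(input_path, output_dir)
  ["soffice", "--headless", "--convert-to", "pdf", "--outdir", output_dir, input_path]
end

# LibreOffice names the output after the input file, with a .pdf extension.
def converted_pdf_path(input_path, output_dir)
  File.join(output_dir, File.basename(input_path, ".*") + ".pdf")
end

# Convert the document to PDF, then read the text page by page.
def extract_document_text(input_path, output_dir)
  require "pdf-reader" # third-party gem, loaded only when actually extracting

  unless system(*libreoffice_pdf_command(input_path, output_dir))
    raise "LibreOffice conversion failed for #{input_path}"
  end

  reader = PDF::Reader.new(converted_pdf_path(input_path, output_dir))
  reader.pages.map(&:text).join("\n")
end
```

Usage would be something like `extract_document_text("report.docx", Dir.mktmpdir)` after `require "tmpdir"`. libreconv wraps the same conversion step if you prefer a gem over shelling out.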
It has a simple REST API to extract text. See https://cwiki.apache.org/confluence/display/TIKA/TikaServer#TikaServer-GettheTextofaDocument
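A minimal sketch of calling that endpoint from Ruby with only the standard library. It assumes a Tika Server is already running on `http://localhost:9998` (its default port); the method names are hypothetical, and the endpoint used is `PUT /tika` with an `Accept: text/plain` header, as described on the linked page.

```ruby
require "net/http"
require "uri"

# Assumption: Tika Server is reachable here; 9998 is its default port.
DEFAULT_TIKA_URL = "http://localhost:9998"

# Build the PUT request for the /tika endpoint without sending it.
def build_tika_text_request(document_bytes, base_url: DEFAULT_TIKA_URL)
  request = Net::HTTP::Put.new(URI.join(base_url, "/tika"))
  request["Accept"] = "text/plain" # ask for plain text instead of XHTML
  request.body = document_bytes
  request
end

# Send a file to Tika and return the extracted text (needs a running server).
def extract_text_with_tika(path, base_url: DEFAULT_TIKA_URL)
  uri = URI.join(base_url, "/tika")
  request = build_tika_text_request(File.binread(path), base_url: base_url)
  response = Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(request) }
  raise "Tika returned HTTP #{response.code}" unless response.is_a?(Net::HTTPSuccess)

  response.body.force_encoding(Encoding::UTF_8)
end
```

Keeping the request builder separate from the network call makes it easy to check the request in a unit test without a server.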
Why isn't this PR merged? This is a common issue for all Ruby 3.2+ projects... CC: @jashkenas @anujaware
:heart:
Fix for this issue: https://github.com/documentcloud/docsplit/issues/158
Fixes compatibility with Ruby 3.2.
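For context, one well-known Ruby 3.2 breakage is the removal of the long-deprecated `File.exists?` and `Dir.exists?` aliases. Whether that is exactly what this PR changes is an assumption here (the linked issue has the details), but the usual fix looks like the sketch below and works on older Rubies too. The helper name is hypothetical.

```ruby
require "tmpdir"

# File.exist? and File.directory? exist on both old and new Rubies;
# the File.exists? / Dir.exists? aliases were removed in Ruby 3.2.
def output_dir_ready?(path)
  File.exist?(path) && File.directory?(path)
end

puts output_dir_ready?(Dir.tmpdir)
```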