Text layer extraction and storage

ARK-Builders / arklib-android

Gradle wrapper for ARKLib, for usage in Android projects

MIT License

Text layer extraction and storage #64

Open kirillt opened 1 year ago

kirillt commented 1 year ago

For resources of kind "Document", it would be useful to extract and store their text. E.g. for PDF resources, the text layer should be similar to the output of the Linux utility pdftotext. The text layer can be used later to filter resources by text in their content, or for various kinds of text analytics (e.g. counting words).
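
To make the two use cases above concrete, here is a minimal sketch of filtering by text and counting words once text layers are available. Everything in it is an assumption for illustration: the `TextLayers` type, its method names, and the in-memory map keyed by a string id are hypothetical, and the issue does not specify how the text is actually stored or how ARKLib identifies resources.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory store: resource id -> extracted text layer.
/// The real storage format is not specified in this issue.
pub struct TextLayers {
    layers: HashMap<String, String>,
}

impl TextLayers {
    pub fn new() -> Self {
        Self { layers: HashMap::new() }
    }

    /// Store the text extracted from one resource.
    pub fn insert(&mut self, id: &str, text: &str) {
        self.layers.insert(id.to_string(), text.to_string());
    }

    /// Return the ids of resources whose text contains `query`
    /// (case-insensitive), sorted so the result order is deterministic.
    pub fn filter_by_text(&self, query: &str) -> Vec<String> {
        let needle = query.to_lowercase();
        let mut ids: Vec<String> = self
            .layers
            .iter()
            .filter(|(_, text)| text.to_lowercase().contains(&needle))
            .map(|(id, _)| id.clone())
            .collect();
        ids.sort();
        ids
    }

    /// Count whitespace-separated words in one resource's text layer.
    /// Returns `None` if no text layer is stored for `id`.
    pub fn word_count(&self, id: &str) -> Option<usize> {
        self.layers
            .get(id)
            .map(|text| text.split_whitespace().count())
    }
}
```

With this sketch, a caller would `insert` the text of each Document resource after extraction, then call `filter_by_text("some phrase")` to get matching ids or `word_count(id)` for simple analytics. Lowercasing the whole text on every query is fine for a sketch but would likely need an index for large collections.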

kirillt commented 8 months ago

Rust side:

https://github.com/ARK-Builders/ark-components/issues/57

Android side:

[ ] create bindings to the Rust function
[ ] modify https://github.com/ARK-Builders/arklib-android/blob/main/lib/src/main/java/dev/arkbuilders/arklib/data/metadata/extractor/DocumentMetadataExtractor.kt and call the Rust function on PDF files
[ ] enable new feature in Navigator