Open despiegk opened 8 months ago
Since vlang has no tools for this, and the ready-to-use tools do not behave the same on all platforms (Windows, Linux and OSX), I created a tool in Rust that can be built and used as a binary: https://github.com/ashraffouda/extractor. The PR for crystallib is here: https://github.com/freeflowuniverse/crystallib/pull/311
Best way to convert PDF, DOCX and HTML to a list of text fragments.
These text fragments can then be given to vilnus, see #292.
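
To make "list of text fragments" concrete, here is a minimal Rust sketch of the step that would come after extraction. It assumes the PDF, DOCX or HTML input has already been turned into plain text, and it does not use or reflect the API of the extractor tool linked above. The function name `split_into_fragments`, the `max_chars` parameter and the rule "one fragment per paragraph, long paragraphs broken at word boundaries" are all assumptions for illustration, not something this issue specifies.

```rust
/// Hypothetical helper: split already-extracted plain text into fragments.
/// Paragraphs are separated by blank lines, runs of whitespace are collapsed
/// to single spaces, and a paragraph longer than `max_chars` characters is
/// broken at word boundaries. A single word longer than `max_chars` is kept
/// whole, so that fragment may exceed the limit.
pub fn split_into_fragments(text: &str, max_chars: usize) -> Vec<String> {
    let mut fragments = Vec::new();
    let mut paragraph: Vec<&str> = Vec::new();

    for line in text.lines() {
        if line.trim().is_empty() {
            // A blank line closes the current paragraph.
            flush(&mut paragraph, max_chars, &mut fragments);
        } else {
            paragraph.extend(line.split_whitespace());
        }
    }
    // Emit whatever is left after the last line.
    flush(&mut paragraph, max_chars, &mut fragments);
    fragments
}

/// Drain the collected words into one or more fragments of at most
/// `max_chars` characters each and append them to `out`.
fn flush(words: &mut Vec<&str>, max_chars: usize, out: &mut Vec<String>) {
    let mut current = String::new();
    let mut current_len = 0usize; // length in characters, not bytes

    for word in words.drain(..) {
        let word_len = word.chars().count();
        // Start a new fragment if adding this word would exceed the limit.
        if !current.is_empty() && current_len + 1 + word_len > max_chars {
            out.push(std::mem::take(&mut current));
            current_len = 0;
        }
        if !current.is_empty() {
            current.push(' ');
            current_len += 1;
        }
        current.push_str(word);
        current_len += word_len;
    }
    if !current.is_empty() {
        out.push(current);
    }
}
```

Calling `split_into_fragments(extracted_text, 500)` would return a `Vec<String>` with one entry per fragment. Whether paragraphs are the right unit, and what size limit suits the consumer in #292, is left open here.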
requirements
todo