Closed: hannahPhys closed this issue 10 months ago
Hi Hannah, we do not currently support Markdown files. The list of supported file types can be found here: https://github.com/llmware-ai/llmware/blob/0bf704661c543e1d34138d72a4b41e13691f7da7/llmware/parsers.py#L180
But we're happy to add this to our feature list. I've turned this issue into an enhancement request.
@turnham Can you inform me if this issue will be straightforward to address, or if it's more complex like the VectorDB issue? I'm eager to contribute to this repository.
@turnham Can I use Google's MakerSuite LLM?
The work involved here would be adding local, Python-based markdown parsing support (i.e., not requiring connectivity to any particular external service).
Different markdown processors vary in the syntax they support, so the first step would be identifying and vetting the best Python markdown processor that handles a broad range of syntax, including older and newer markdown elements. I'm sure there are many Python markdown parsers to investigate.
The remaining work would be updating the Parser APIs to support the creation of blocks, adding good error handling, and building up a good test suite of markdown test documents that cover a broad range of syntax (ideally all possible markdown tags/elements).
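To make the "creation of blocks" step concrete, here is a rough, standard-library-only sketch that splits a markdown string into one block per heading section. It is not llmware's Parser API and not a vetted markdown processor: the function name `split_markdown_into_blocks` and the block dictionary layout are hypothetical, and it only recognizes `#`-style headings and triple-backtick fences.

```python
import re
from typing import Dict, List

# Matches "#"-style headings such as "# Title" or "### Sub-section".
HEADING_RE = re.compile(r"^(#{1,6})\s+(.+?)\s*$")


def split_markdown_into_blocks(text: str) -> List[Dict[str, object]]:
    """Split markdown into blocks, one per heading section (hypothetical helper)."""
    blocks: List[Dict[str, object]] = []
    heading, level, lines = "", 0, []
    in_fence = False

    def flush() -> None:
        # Keep a block only if it has a heading or some non-blank content.
        body = "\n".join(lines).strip()
        if heading or body:
            blocks.append({"heading": heading, "level": level, "text": body})

    for line in text.splitlines():
        # Track fenced code so "#" comments inside code are not read as headings.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        match = None if in_fence else HEADING_RE.match(line)
        if match:
            flush()
            heading, level, lines = match.group(2), len(match.group(1)), []
        else:
            lines.append(line)
    flush()
    return blocks
```

A test suite along the lines described above could feed documents with nested headings, fenced code, tables, and so on through a function like this and assert on the resulting blocks. A real implementation would presumably delegate the syntax handling to whichever Python markdown parser is chosen.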
As someone who uses MD files for PKM, this would be a huge boon.
LlamaHub has this as their implementation; I'm going to see if I can find other examples.
And honestly, why not just treat it as a text file?
@hannahPhys & @dahifi - thanks for your feedback on markdown documents. We have implemented it as you recommended, i.e., treating .md files as .txt files. Support is in the new version that we released today, so please feel free to pull from the repo or run a new pip install llmware==0.1.9. Try it out and keep the feedback coming. Thanks for your engagement with the llmware community!
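For anyone curious what "treating .md as .txt" can look like in general, here is a minimal sketch that routes both extensions to the same plain-text handler. It is an illustration only and makes no claim about how llmware 0.1.9 does it internally; `parse_plain_text`, `TEXT_HANDLERS`, and `parse_file` are hypothetical names, and splitting on blank lines is an assumed chunking rule.

```python
from pathlib import Path
from typing import Callable, Dict, List


def parse_plain_text(path: Path) -> List[str]:
    """Read a file as UTF-8 text and split it into chunks on blank lines."""
    raw = path.read_text(encoding="utf-8", errors="replace")
    return [chunk.strip() for chunk in raw.split("\n\n") if chunk.strip()]


# Hypothetical dispatch table: markdown is handled exactly like plain text,
# so headings and other markup stay in the chunks as literal characters.
TEXT_HANDLERS: Dict[str, Callable[[Path], List[str]]] = {
    ".txt": parse_plain_text,
    ".md": parse_plain_text,
}


def parse_file(path: Path) -> List[str]:
    """Pick a handler by file extension (case-insensitive)."""
    handler = TEXT_HANDLERS.get(path.suffix.lower())
    if handler is None:
        raise ValueError(f"Unsupported file type: {path.suffix!r}")
    return handler(path)
```

The trade-off of this approach is simplicity over structure: nothing markdown-specific can break, but headings, lists, and links are not interpreted.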
Is library.add_files currently supporting .md files? With my folder of PDFs and markdown files, only the PDFs appear in the library output.