Open BlazarKnight opened 6 months ago
I found this repo that may be helpful for extracting data from sources: https://github.com/opendatalab/MinerU
@Iain-Crowe I don't think we should use AI to gather the sources for other AI. I know it will be harder to do manually, but gov docs do have more structure than your typical Wikipedia page. Furthermore, I want AI to be used sparingly because of its fallibility (Source: https://unu.edu/article/never-assume-accuracy-artificial-intelligence-information-equals-truth). I want the AI to be more of a help line: if you right-click on a word or a doc, it explains the legal and technical language so you can understand it. Sorry for not being clearer in my original post, I didn't even know where to begin.
Could you explain in more detail what the AI element is intended to do? I'd be happy to recommend model types based on what is needed.