The Git Repo data source feature is almost useless in its current state. Even with 12 GB of RAM dedicated to dialoq, it fails to ingest most repos. I had an idea that may be more efficient and easy to implement. There are quite a few repo-to-PDF tools that are fast and use almost no RAM. Example: https://github.com/BankkRoll/repo2pdf, which clones a repo and can convert each file into a PDF. If you ran a tool like this in the background and then chunked and ingested the resulting PDFs, it might actually be more efficient.
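To make the "chunk and ingest the PDFs" part concrete, here is a rough sketch of how the chunking step could keep memory low by handling one file at a time. It assumes the per-file PDFs have already been written to a directory by a tool like repo2pdf. Everything named here is hypothetical: `iter_chunks`, `chunk_size`, `overlap`, and the `extract_text` callable (a stand-in for whatever PDF text extractor dialoq already uses). It is not how dialoq or repo2pdf actually works.

```python
from pathlib import Path
from typing import Callable, Iterator


def iter_chunks(
    pdf_dir: Path,
    extract_text: Callable[[Path], str],  # hypothetical: caller supplies the PDF-to-text step
    chunk_size: int = 1000,
    overlap: int = 100,
) -> Iterator[tuple[str, str]]:
    """Yield (file name, chunk) pairs, holding only one file's text in memory at a time."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    # Walk the directory lazily instead of loading the whole repo up front.
    for pdf_path in sorted(pdf_dir.rglob("*.pdf")):
        text = extract_text(pdf_path)
        start = 0
        while start < len(text):
            yield pdf_path.name, text[start:start + chunk_size]
            if start + chunk_size >= len(text):
                break  # the last chunk already reached the end of the file
            start += step
```

Because this is a generator, the ingester could embed and store each chunk as it arrives, so peak memory would depend on the largest single file rather than on the size of the whole repo.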