alan-turing-institute / reginald

Reginald repository for REG Hack Week 23

Add try/except for GitHub readers #174

Closed rwood-97 closed 5 months ago

rwood-97 commented 7 months ago

This PR adds try/except blocks around the LlamaIndex GitHub readers to fix the error we are getting when loading our data in Azure.

Fixes #157

rwood-97 commented 7 months ago

This works with running the container!

Annoyingly, the way it works is that if ANY request fails, it skips that whole part of the data load. So what's currently running has skipped The Turing Way and the Hut23 repo (we should check whether it is always these two causing issues) but has all the info from the RDS courses, wikis, etc.

Ideally we should add a boolean flag to llama-index, something like fail_on_http_error, so we can force it to continue when a single file fails rather than raising an exception. I will open an issue on llama-index about this and maybe try to address it next Friday in the OS hack session.
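A minimal sketch of what such a flag could look like. Only the name `fail_on_http_error` comes from the comment above; `read_files`, `HTTPError` and `fake_fetch` are made up for illustration and are not llama-index's API.

```python
from typing import Callable, Iterable, List


class HTTPError(Exception):
    """Stand-in for whatever error the HTTP client raises (assumption)."""


def read_files(
    paths: Iterable[str],
    fetch: Callable[[str], str],
    fail_on_http_error: bool = True,
) -> List[str]:
    """Fetch each file; optionally skip individual files that fail."""
    contents: List[str] = []
    for path in paths:
        try:
            contents.append(fetch(path))
        except HTTPError:
            if fail_on_http_error:
                # Default: keep the existing behaviour and abort the load.
                raise
            # Otherwise drop only this file and continue with the rest.
    return contents


def fake_fetch(path: str) -> str:
    # Simulates one file in a repo that cannot be fetched.
    if path == "broken.md":
        raise HTTPError(f"could not fetch {path}")
    return f"contents of {path}"


paths = ["a.md", "broken.md", "b.md"]
loaded = read_files(paths, fake_fetch, fail_on_http_error=False)
```

Defaulting the flag to True would keep current behaviour for existing users, and we would pass False so that one bad file no longer costs us a whole repo.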