The RAG backend should enforce a file/data size limit and reliably handle files of at least ~200 MB, of all types. At the moment, some files of around 100 MB or more fail during ingestion and document storage.
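As a rough illustration of how such a limit could be enforced at upload time, here is a minimal sketch that copies an upload to its destination in chunks and aborts once a byte limit is exceeded. The names `MAX_UPLOAD_BYTES`, `FileTooLargeError`, and `copy_with_limit` are hypothetical and not part of the existing backend; the 200 MB value is only taken from the target stated above.

```python
import io

# Assumed limit, based on the ~200 MB target in this issue (hypothetical constant).
MAX_UPLOAD_BYTES = 200 * 1024 * 1024


class FileTooLargeError(ValueError):
    """Raised when an upload exceeds the configured size limit (hypothetical)."""


def copy_with_limit(src, dst, max_bytes=MAX_UPLOAD_BYTES, chunk_size=1024 * 1024):
    """Copy a binary stream to dst in chunks; raise once max_bytes is exceeded.

    Only one chunk is held in memory at a time, so the check itself does not
    add memory pressure for large uploads.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            raise FileTooLargeError(f"upload exceeds {max_bytes} bytes")
        dst.write(chunk)
    return total


# Usage with in-memory streams standing in for a real upload and storage target.
written = copy_with_limit(io.BytesIO(b"x" * 10), io.BytesIO(), max_bytes=16)
```

Rejecting oversized uploads early would at least turn a silent ingestion failure into a clear error, whatever the final limit turns out to be.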
The issue appears to be isolated to two conditions: the backend is running inside K8s, and the uploaded file contains many lines of data. File size alone does not determine what can be uploaded. For example, a 200K-line file seems to overwhelm the TextSplitter or another built-in method when running inside K8s. It seems the K8s pod needs to be scheduled with a resource that is currently missing or defaulted to a low value.
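Independent of the pod's resource settings, one possible mitigation is to avoid handing the whole file to the splitter at once. The sketch below is an assumption about how that could look, not the backend's current behavior: `iter_line_batches` is a hypothetical helper that yields a line-heavy file in fixed-size blocks, so only one block is in memory at a time.

```python
from itertools import islice


def iter_line_batches(lines, batch_size=1000):
    """Yield consecutive blocks of up to batch_size lines, joined into one string.

    `lines` can be any iterable of strings, including an open text file,
    which is read lazily line by line.
    """
    it = iter(lines)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield "".join(batch)


# Hypothetical usage; `path` and `splitter` are placeholders for whatever
# the backend actually uses:
#
# with open(path, encoding="utf-8") as f:
#     for block in iter_line_batches(f, batch_size=5000):
#         chunks = splitter.split_text(block)
#         ...  # embed and store chunks before reading the next block
```

The trade-off is that chunks can no longer span a batch boundary, so the batch size should be large relative to the chunk size. If the failures disappear with batching, that would support the theory that the pod is hitting a memory-related limit.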