Zipstack / unstract

No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents
https://unstract.com
GNU Affero General Public License v3.0

[FIX] Supporting multi chunk for large documents #335

Closed harini-venkataraman closed 4 months ago

harini-venkataraman commented 4 months ago

What

This PR enables the following.

  1. Support for returning a proper response when a large document is executed.

...

Why

Context retrieval for larger documents was not optimized for non-zero chunks.

...

How

By adding support for Simple and Sub-query retrieval.

...
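For readers unfamiliar with the two strategies named above, here is a minimal sketch of how they differ. It is not the code from this PR: the function names (`simple_retrieve`, `subquery_retrieve`) are hypothetical, and a toy token-overlap score stands in for the similarity search a vector database would normally perform.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, chunk):
    """Toy relevance score: number of token occurrences shared by query and chunk."""
    q, c = Counter(tokenize(query)), Counter(tokenize(chunk))
    return sum(min(q[t], c[t]) for t in q)


def simple_retrieve(query, chunks, top_k=2):
    """Simple retrieval: rank every chunk against the whole query, keep the top_k hits."""
    ranked = sorted(range(len(chunks)), key=lambda i: score(query, chunks[i]), reverse=True)
    # Drop chunks that share no tokens with the query at all.
    return [i for i in ranked[:top_k] if score(query, chunks[i]) > 0]


def subquery_retrieve(sub_queries, chunks, top_k=1):
    """Sub-query retrieval: retrieve per sub-query, then merge results without duplicates."""
    merged = []
    for sub_query in sub_queries:
        for i in simple_retrieve(sub_query, chunks, top_k):
            if i not in merged:
                merged.append(i)
    return merged


# Hypothetical chunks of a large document (illustrative data only).
chunks = [
    "The invoice number is INV-42.",
    "Payment is due within 30 days.",
    "The vendor is Acme Corp.",
]
print(simple_retrieve("invoice number", chunks, top_k=1))
print(subquery_retrieve(["invoice number", "vendor name"], chunks))
```

Both functions return chunk indices. In this sketch the sub-queries are passed in explicitly; how a prompt gets split into sub-queries in practice (for example, by asking an LLM) is outside what this example assumes.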

Can this PR break any existing features? If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)

No. The support is enabled by refactoring a method in the prompt service. ...

Database Migrations

Added a field to Choices in the model. ...

Env Config

Not Applicable. ...

Relevant Docs

...

Related Issues or PRs

...

Dependencies Versions

...

Notes on Testing

Reproducing the error: the response was correct for documents with 500 pages; the ambiguous response occurred only for documents with more than 1000 pages.

...

Screenshots

Error reproduced: Screenshot from 2024-05-13 22-31-22

Fix: Screenshot from 2024-05-14 14-41-53

Screenshot from 2024-05-13 18-13-45

Checklist

I have read and understood the Contribution Guidelines.

sonarcloud[bot] commented 4 months ago

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud