The maximum output length is 4096 tokens, which seems to be only enough for JSON representing 2 or 3 pages' worth of questions.
I think the way to go is to process each page of the document in a separate call to the API.
This would also allow us to return results much sooner. You could process the first page, and then either continue to process the remaining pages in the background or only process them when the user requests them.
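
To make the idea concrete, here is a rough sketch of per-page processing in Python. Everything named here is an assumption for illustration: `PagewiseProcessor` is a hypothetical helper, and `extract` stands in for whatever function sends a single page to the API and returns the parsed questions. The `eager` flag switches between the two options above: queue every page in the background up front, or only call the API for a page when it is first requested.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical signature: takes the text of ONE page, makes one API call,
# and returns the parsed questions for that page.
ExtractFn = Callable[[str], List[dict]]


class PagewiseProcessor:
    """Runs one API call per page so no single response has to cover the whole document."""

    def __init__(self, pages: List[str], extract: ExtractFn,
                 eager: bool = True, max_workers: int = 4):
        self._pages = pages
        self._extract = extract
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._futures: Dict[int, Future] = {}
        if eager:
            # Background mode: queue every page now, first page first.
            for index in range(len(pages)):
                self._submit(index)

    def _submit(self, index: int) -> Future:
        # Start the call for this page only once; later requests reuse the result.
        # Not thread-safe: call get() from a single thread in this sketch.
        if index not in self._futures:
            self._futures[index] = self._pool.submit(self._extract, self._pages[index])
        return self._futures[index]

    def get(self, index: int) -> List[dict]:
        """Block until the given page is ready, starting its call if needed."""
        return self._submit(index).result()

    def close(self) -> None:
        # Wait for any queued pages to finish, then release the worker threads.
        self._pool.shutdown(wait=True)
```

With `eager=True` you would call `get(0)` to show the first page as soon as it is ready while the rest finish in the background. With `eager=False` no API call is made for a page until the user asks for it.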