Closed jorelllinsangan closed 1 year ago
Hi & thanks for raising this!
As mentioned in the linked PR, there's an alpha version 0.3.1-alpha.1 now available on NPM with a draft fix. It would be great if you could try it out and let us know whether it resolves your issue.
If possible I'd also like to fix our TRP-side API type definitions at the same time, as `null` is unexpected here (per e.g. the GetDocumentAnalysis API doc). Could you confirm whether:
- The `null`s you're seeing above on `NextToken`, `StatusMessage`, and `Warnings` are coming directly from Amazon Textract API results? Or is there any transformation in your pipeline that could be adding them in?
- It's the `GetDocumentAnalysis` API where you're observing this? Any other APIs or particular special cases where you're seeing it?

Hi, thanks for looking into this!
No. We don't have a data pipeline. The results are straight from Textract. We just specify our own location where the results should be written to.
We actually chose not to use GetDocumentAnalysis. We manually download the analysis result from our S3 bucket.
We noticed in our project that Textract was logging warnings about possibly truncated content when instantiating a `TextractDocument`. I took a closer look at the content we were trying to parse and saw that the key `NextToken` was part of the JSON blob but was just set to null.

I found that the constructor of `TextractDocument` simply checks whether the key exists and logs the warning if it does. It should probably check both that the key exists and that its value is actually set.
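
To illustrate the difference, here is a minimal sketch. It is not the library's actual code: the `PagedResponseLike` type and both function names are hypothetical, and the only assumption is that a result written to S3 can contain `"NextToken": null`, as we observed.

```typescript
// Hypothetical, minimal shape: only the field relevant to this issue.
interface PagedResponseLike {
  NextToken?: string | null;
}

// Key-existence check, as described above: true even when the value is null.
function hasNextTokenKey(response: PagedResponseLike): boolean {
  return "NextToken" in response;
}

// Suggested check: only treat the response as truncated when the key
// holds a non-empty string.
function hasMorePages(response: PagedResponseLike): boolean {
  return typeof response.NextToken === "string" && response.NextToken.length > 0;
}

// A blob like the one we downloaded, with the key present but set to null.
const fromS3: PagedResponseLike = JSON.parse('{"NextToken": null}');

console.log(hasNextTokenKey(fromS3)); // true  (would trigger the warning)
console.log(hasMorePages(fromS3)); // false (no warning)
```

With a check along the lines of `hasMorePages`, a `null` or missing `NextToken` would both be treated as "no further pages", while a real token string would still trigger the truncation warning.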