If you view the document now, while it is public, it shows OCR: Textract along with the cached plaintext from the old OCR results.
If you then change the document to private, the document updates and shows the correct OCR: Azure Document Intelligence with updated plaintext (notice that the word Mayor is now above the line Department of Police, City of Chicago).
However, if I change the document back to public, the old cache persists and it switches back to OCR: Textract.
You can test and reproduce this with the following document: https://www.documentcloud.org/documents/23962135-p543161_muckrock_news_clear_data_foia_letterdoc which has the following OCR JSON results: https://s3.documentcloud.org/documents/23962135/p543161_muckrock_news_clear_data_foia_letterdoc.txt.json