Closed data-doge closed 7 months ago
Risks: Changing the format or cutting the document leaves us unclear what a claim reviewer would see if submitted that way.
Preferred: Messaging to tell Veterans that they cannot upload web pages converted into long PDFs, needs to be saved as standard 8.5 x 11 pages
Action items:
Did more digging into PDF creation on Safari. If they use the straight 'Export as PDF...' option, it'll be one single long page. If the use the 'Print...' option, and then select PDF, then it's broken down into multiple pages (as expected via Chrome). It does also change the formatting and removes some images... which risks rendering it in a manner the user might not want.
Example PDFs attached as reference. cats - Google Search - multi-page.pdf cats - Google Search.pdf
I did a bit of digging using iOS iPad. It seems that they do cut the pages, after it hits 200inches.
Chrome, Safari, and Edge all have a print option to save to PDF.
Safari on Mac and iPad has an 'export pdf' or 'create pdf' option that creates the long image.
Also, it's very hard to convert a long image PDF into smaller PDFs, since the trick is to do it with print modal when you're saving the original page. If you try to do that in Acrobat, it gives you an error, telling you to save as, rather than print, which prevents you from formatting it. Preview also doesn't have an option to tile to print.
@data-doge Did you sample any more instances of this error? Are there any other cases of large dimensions we need to plan for, or should we focus on resolving the long webpage issue?
Opened a conversation with CAIA since document dimensions might not be a silo'd occurrence with our form. This is the suggestion we got back from them:
Here's version of the error message I shared with the 530 EZ forms team (I removed the dimensions specific to their form): Your file can't have a [width/height] larger than [dimension]. Follow the instructions for your device on how to resize the file and try again.
It clearly defines the problem, and puts control back to the Veteran, but the guidance language leaves more to be desired. Resize might not make sense for our context (if they manage to resize to fit the length limit, it might not be legible). Having this validation at least identifies which file isn't working, and lets them make the decision on how to make it work.
The other way is to also include a suggestion to mailing the evidence in separately. It could be a fallback for all evidence types that can't be submitted digitally.
Started a document to capture recommendations. Would want to review with Julie to align on best next steps.
@eileen-coforma is talking to Julie later today about this
CAIA needs a few days to look at this. Will follow-up with them on Friday.
CAIA made recommendations for the error, and I’ve adapted their suggestion into this design.
A few notes, also as comments in the design file:
Open questions:
Updated design to remove additional info box, and separate into another ticket. Will follow-up with DS regarding error alert pattern. Also have to address where the error alert is placed (if multiple files are submitted).
@eileen-coforma to clear with the design system team if we can use an error alert for this action
Still in discovery, and can be deferred until after NOD testing is complete. We need all hands on deck, and this sits lower in priority.
Need to talk to design about the right way to show the error alert. @eileen-coforma
DS proposed a different error alert than initially designed. Revisiting error language with CAIA on Tues 4:30ET to try and condense message.
Attached design with error language to the ticket.
Action items:
@anniebtran do we know if the 78 in x 101 in. is referring to H x W or W x H? It's weird that it's two different numbers.
78 = width, 101 = height (code for reference)
@eileen-coforma to review the design with CAIA & Julie.
Spike work being done in another ticket so this ticket will be repurposed for the actual code work
@anniebtran Is there a way to determine which dimension exceeds the size limitation AKA height vs width? Wondering if we can be more specific in the error message.
Asked CAIA to review language, once confirmed, will async it with Julie.
@eileen-coforma Unfortunately I don't think so, at least from what I can tell in the validator code. It just checks for whether one of those dimensions has been exceeded and adds an error message stating what both limits are
Julie and CAIA have reviewed and approved the error message.
Two problems we are aware of and do not address in the current solution:
@anniebtran @Mottie to apply an estimate
@anniebtran how's this one going? Hoping to prioritize this if there's nothing more urgent on the bug triage homefront.
@anniebtran working on BE part of this. @Mottie working on FE, seems like a lot more work that needs to be done
Taking v3 component out of the scope of this
I talked to Kyle to get some background on why we never switched this feature on and make sure we weren't missing any glaring reasons to not use it — here are some notes from him:
I think Matt Self had told us to NOT enable it in prod because we were unsure if the requirements LH was using were actually REAL requirements or just there arbitrarily. For example the limit used to be 50Mb I think and then just one day we were like.... hey what happens if you just try to send a larger file downstream, and larger files worked or something like that. So Matt self wanted to try to get to the bottom of WHERE the actual requirements and bottlenecks are/were, and then make sure everyone enforces those. ... I dont think he or we got anywhere with that line of investigation. So... with all that in mind, you could test and make sure it still works as intended, and then if so, just flip it on. I mean we probably should have done that anyways because if we accept the upload, but it doesnt pass LH validation LH is going to error when sidekiq tries to upload it anyways, so it will go into the blackhole and the vet wont know, I think.... so turning it on is the same outcome (file does not get to the VA), but at least the Vet is aware of it and it is shown to them in the UI.
Had some interesting struggles getting this spun up on staging but I think we're good. Some notes on that:
Settings.decision_review.pdf_validation.url
) not updated since late 2022, so I went ahead and updated those in the parameter store. This fixed the issues with making the API call in staging but not in productioncurl -X POST 'https://staging-api.va.gov/services/vba_documents/v1/uploads/validate_document' \
--header 'apikey: <insert API key you're testing here>' \
--header 'accept: application/json' \
--data-raw '{}'
PR to enable the PDF validator backend on production is merged. Unless pods get refreshed before the usual prod deploy, we might need to wait until 2pm CT/3pm ET tomorrow in order to for the production environment to pick up the change.
@rocio-desantiago to validate in prod, if it looks good we can close this out
Looks good on prod ✅ closing
Value Statement
As a Veteran I want to submit evidence, which may include very large documents, for my NOD / SC So that my appeal can be evaluated with all the information I believe is relevant.
Background Context
Roughly 30% of the evidence submission errors we see downstream have the message: "Maximum page size exceeded. Limit is 78 in x 101 in." This is a Data Dimensions requirement, and the limit cannot be increased.
We investigated three relevant PDFs in S3. Two were 1-page continuous PDFs of an entire webpage (i.e. blog post from law firm; scientific paper). One PDF was a long report regarding burn pits, potentially originating from a web page. It is likely that all of these dimensions errors are cases where a PDF representation of a web page is normal in width, but very long. Though, while investigating a different error, we encountered a PDF with several pages, one of which was outlandishly large, with dimensions likely exceeding a thousand inches.
Implementation Notes (from Annie)
If we were to keep the current implementation and turn enable the setting that calls the /validate_document endpoint, we would have to tweak the transaction flow a little to handle the user password scenario. Some notes as I was thinking through how to approach that:
Additionally, we'll need to modify platform's file upload component to skip the component's/frontend password checking step in favor of letting the API call detect whether a file is encrypted with a password or not AND have it display the password prompt based on the response returned from the backend.
Looks like another team made a copy of the FileField component in order to customize it to their needs, so we might need to take this path. I'm not sure how much of a lift it would be, and it's definitely not DRY, but would probably be the least invasive approach.
Acceptance Criteria
For SC and NOD forms:
Design
Tech Tasks
Definition of Ready
Definition of Done