zjean closed this 3 years ago
My best guess based on that is that when this document was added, the null character \0 (used for marking the end of a string) was somehow extracted from the PDF content and saved in the content field. The API now complains about that character being in the content field when saving.
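To illustrate the guess above, here is a minimal sketch of a sanitizing step, assuming the extracted text is available as a Python string before it is stored. The function name `strip_null_characters` is hypothetical and is not part of paperless; it only shows how the `\0` character could be dropped from extracted text.

```python
def strip_null_characters(text: str) -> str:
    """Return text with every NUL (U+0000) character removed."""
    return text.replace("\x00", "")


# Example value modeled on the word reported in this issue.
extracted = "belas\x00ngdienst"
cleaned = strip_null_characters(extracted)
print(repr(cleaned))
```

This only removes the offending character. It does not recover the letters ('ti') that were lost during extraction.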
document_archiver -f -d 236
Edit: I see that the content field contains the value '\u0000' in several words. It appears in place of the letter combination 'ti', like this (pasted from the JSON in the network tab, since the edit field shows a square): belas\u0000ngdienst. Removing it in the content textarea in the UI doesn't help.
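For anyone who wants to check their own documents, the following is a rough sketch of how the affected words could be located in a JSON payload copied from the network tab. The payload here is made up for illustration (only the word `belas\u0000ngdienst` comes from the report), and `words_with_null` is a hypothetical helper, not a paperless function.

```python
import json
from typing import List


def words_with_null(content: str) -> List[str]:
    """Return the whitespace-separated words that contain a NUL character."""
    return [word for word in content.split() if "\x00" in word]


# Invented payload; the surrounding words are placeholders.
payload = '{"content": "letter from belas\\u0000ngdienst today"}'
document = json.loads(payload)

for word in words_with_null(document["content"]):
    # Print the NUL as a visible escape so it does not show up as a square.
    print(word.replace("\x00", "\\u0000"))
```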
Hm. Is that document confidential? I'd really like to figure out if this is caused by tesseract, OCRmyPDF, or something in paperless.
Thanks! I cut the content and pasted it back from Notepad++. That allowed me to save the document. The document is quite confidential, sorry. If I encounter this again with a less sensitive document, I will let you know!
The root of this issue is addressed as part of #794, so I'll go ahead and close this.
Describe the bug
I added a new document via the web interface, opened it, and tried to add the metadata. When saving the document, nothing happens. I checked the developer tools of my browser (Chrome) and saw a network error: HTTP 400 {"content":["Null-tekens zijn niet toegestaan."]} (Dutch for "Null characters are not allowed.")
To Reproduce
Steps to reproduce the behavior:
Expected behavior
I expect the document to be saved.
Screenshots
Webserver logs
Relevant information
docker-compose.yml, docker-compose.env or paperless.conf