laurent22 / joplin

Joplin - the privacy-focused note taking app with sync capabilities for Windows, macOS, Linux, Android and iOS.
https://joplinapp.org

Attached files are corrupt after sync, download and decryption #8979

Closed: pjsgsy closed this issue 1 year ago

pjsgsy commented 1 year ago

Syncing via WebDAV to IIS. Both Windows clients run 2.13.1; the Android client runs 2.12.3.

Create a simple note on the first Windows client I configured. Call it 'test note'. Drag and drop a single image (jpg), or another attachment such as a pdf, into the note. This all works as expected on the original client where the note was created. Sync the note. That is OK.

Sync a different client. The note is synced and no errors are reported. Click the note: the timer appears and the attachment downloads, as you would expect. It says it is decrypting, and then it is decrypted. Sync status shows the note downloaded and decrypted. But the image is a broken link (the inline viewer is unable to show it), and the PDF is the same. Download the attachment and try to view it with external viewers, and they all report a broken format. Images will not show. PDFs are reported as corrupt. All normal note content, HTML etc. seems to download OK. It's just attached files that seem to be the issue.

Double-check encryption status on all devices. All say they are OK and are using the same key. No errors at all as far as I can see, only the fact that the attachments are somehow corrupt after download. Perhaps the encryption is not decrypting correctly. That is the only thing I can think of. I would try removing the encryption as a test, but it took a long time to set up and I have a lot of notes, so I would rather not break it all unless needed!

Not OS or client specific, as the same issue is reproduced on this setup with Windows 10 and the Windows client as well as Android 13 and the Android client.

The log appears to show the note being downloaded and decrypted OK on the client.

debug.txt

wh201906 commented 1 year ago

Could you please post the original file and the broken file there?

pjsgsy commented 1 year ago

Thank you for the response.

Here is an example pdf. The one ending GOOD is from the Windows client it was created on, and the one ending BAD is from another Windows client, which downloaded the note after sync when I clicked on the note. I included some screenshots too. I right-clicked the link in each client and used 'Save As' to get the file. I note the file sizes are hugely different. I opened the BAD file in a text editor and actually see an HTML error in there! You will see it. So, this is the cause of the issue. Why it cannot find the file it requested, and why Joplin does not report any sort of error, I do not know.

Good one: (screenshot)

Bad one: (screenshot)

new-guide-to-installlation-of-pv-systems-mcs_20130530161524 BAD.pdf

new-guide-to-installlation-of-pv-systems-mcs_20130530161524 GOOD.pdf

pjsgsy commented 1 year ago

Just in case you cannot see or open the second (BAD) pdf file for any reason: it actually contains this HTML error page... `<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

404 - File or directory not found.

404 - File or directory not found.

The resource you are looking for might have been removed, had its name changed, or is temporarily unavailable.

`
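A quick way to tell this kind of failure apart from a real decryption problem is to look at the first bytes of the saved file. The following is only a minimal sketch in Python, not anything Joplin does itself; the function name `classify_download` is hypothetical, and it only knows the PDF and JPEG signatures plus a leading HTML doctype or tag.

```python
def classify_download(data: bytes) -> str:
    """Guess what a saved attachment really contains from its leading bytes.

    Hypothetical helper for illustration; not part of Joplin.
    """
    # PDF files start with the ASCII marker "%PDF-".
    if data.startswith(b"%PDF-"):
        return "pdf"
    # JPEG files start with the bytes FF D8 FF.
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    # An HTML error page usually starts with a doctype or an <html> tag,
    # possibly after some whitespace.
    head = data.lstrip()[:512].lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    return "unknown"
```

Running it over the bytes of the GOOD and BAD files would be expected to give `"pdf"` and `"html"` respectively, which points at the server response rather than at encryption.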

wh201906 commented 1 year ago

> That is the only thing I can think of. I would try removing the encryption as a test, but it took a long time to set up and I have a lot of notes, so I would rather not break it all unless needed!

You can use the portable version, which won't affect the installed Joplin.

wh201906 commented 1 year ago

Have you ever tried other sync methods?

pjsgsy commented 1 year ago

> Have you ever tried other sync methods?

No - I set it up with WebDAV and, as all the notes were actually syncing, I assumed it was working. Well, for at least a week or so, until I tried accessing some attachments remotely. Odd that it is only the attachments. Note content syncs fine and all the notes are created on the clients OK. I do have attachment download set to 'Auto', to avoid downloading all the attachments by default and save space on the underpowered clients! Maybe that has something to do with it. As the server is returning a standard HTTP error, though, it seems to me that it would be a good idea for Joplin to recognise that and show some sort of error message about the failure to fetch the file.

I will try and get some information from the web server log files as to what the exact request being made is. Perhaps that will give us some clues.

pjsgsy commented 1 year ago

I have done some more investigation on this. I decided to use the Joplin function to resync and upload all files to the server again. Once it completed (a couple of hours), I checked the file for the PDF on the device it was created on.

You can see the files here, using the 'reveal in folder' option in Joplin:

image

If I now search in the webdav folder on the server, I see the file name, but it is just 2k in size...

image

I found the PUT request in the server log. It seems OK:

2023-10-01 15:56:41 192.168.2.43 PUT /ffe27ad94a8625b3608ee15f265ad02a.md - 443 pjsxxx 192.168.2.100 Joplin/1.0 - 204 0 0 11
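For anyone doing the same digging, the log line above can be split into named fields so that failed requests are easy to filter out. This is a minimal Python sketch, assuming the server writes the default IIS W3C field order (an assumption; the `#Fields:` header of the actual log file is the authority). The names `IIS_FIELDS`, `parse_iis_line` and `failed_requests` are hypothetical.

```python
# Assumed default IIS W3C field order; check the "#Fields:" line of your log.
IIS_FIELDS = [
    "date", "time", "s-ip", "cs-method", "cs-uri-stem", "cs-uri-query",
    "s-port", "cs-username", "c-ip", "cs(User-Agent)", "cs(Referer)",
    "sc-status", "sc-substatus", "sc-win32-status", "time-taken",
]


def parse_iis_line(line: str) -> dict:
    """Split one space-separated log line into a field-name -> value dict."""
    parts = line.split()
    if len(parts) != len(IIS_FIELDS):
        raise ValueError(f"expected {len(IIS_FIELDS)} fields, got {len(parts)}")
    return dict(zip(IIS_FIELDS, parts))


def failed_requests(lines):
    """Yield parsed entries whose HTTP status is 400 or higher."""
    for line in lines:
        # Skip blank lines and "#" header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        entry = parse_iis_line(line)
        if int(entry["sc-status"]) >= 400:
            yield entry
```

Applied to the PUT line above, `sc-status` comes out as `204`, a success code for an upload, so that request is not reported as failed. A 404 on a later GET for the same resource id would be.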

I searched the local Joplin log for that file. Here is the relevant entry I could see:

2023-10-01 16:56:40: Synchronizer: "Sync: createRemote: remote does not exist, and local is new and has never been synced: Resource: (Local ffe27ad94a8625b3608ee15f265ad02a)"

Here is the sync status

image

I don't see any more detailed debug info, and the logs and application report no errors at all, but it seems Joplin is not uploading the full file successfully.

Is there any way to get more log info, more verbose logging, that might point the way to what is going on?

wh201906 commented 1 year ago

> If I now search in the webdav folder on the server, I see the file name, but it is just 2k in size...

Your resource file should be located in the .resource folder on the server. I guess the .md file you mentioned only holds its metadata.

For example, the resource id of your file is ffe27ad94a8625b3608ee15f265ad02a, and the encrypted file size is 6993 kB. So theoretically you can find a file named ffe27ad94a8625b3608ee15f265ad02a in the .resource folder on your server, and its size should also be 6993 kB.
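The layout described above can be written down as a small Python sketch: one small `.md` item holding the metadata and one extensionless blob under `.resource`. The function name `remote_paths` and the base URL in the usage line are hypothetical; only the `.resource` folder and the naming by resource id come from the comment above.

```python
def remote_paths(base_url: str, resource_id: str) -> dict:
    """Return the two remote locations associated with one attachment.

    Hypothetical helper for illustration; the layout follows the
    description in this thread, not Joplin's source code.
    """
    base = base_url.rstrip("/")
    return {
        # Small metadata item, synced like a note.
        "metadata": f"{base}/{resource_id}.md",
        # The (encrypted) attachment itself, stored without a file extension.
        "blob": f"{base}/.resource/{resource_id}",
    }
```

For example, `remote_paths("https://dav.example.com/joplin", "ffe27ad94a8625b3608ee15f265ad02a")["blob"]` gives the path whose size should be compared against the local encrypted file.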

wh201906 commented 1 year ago

I guess this issue is something unexpected so the log won't tell us more.

pjsgsy commented 1 year ago

Ah, got it. Thank you. Yes - If I look in the resource folder, it is there, correctly sized

image

So, the other client had been offline since the resync. I started it up and let the Windows client fully re-sync (attachments set to 'Auto', so they download when you click on the note), then I tried to access the file again from that client. Looking in the web server log, I saw that the request for that file was being 404'ed. Debugging further, I found it was because Joplin adds no file extension to the data file name, so the web server does not know how to serve it (no MIME type). Adding a MIME type mapping for the `.` extension resolved the issue. The files now serve!
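The behaviour described here can be modelled in a few lines of Python. This is a toy model of an extension-based static file handler, not IIS itself: the name `serve_status` is hypothetical, the `"."` key stands for "no extension" as in the mapping mentioned above, and `application/octet-stream` is an assumed MIME type, since the thread does not say which one was configured.

```python
import posixpath


def serve_status(filename: str, mime_map: dict) -> tuple:
    """Return (status, mime_type) the way an extension-based handler might.

    Toy model for illustration only; "." is used as the key for files
    that have no extension.
    """
    ext = posixpath.splitext(filename)[1].lower() or "."
    mime = mime_map.get(ext)
    if mime is None:
        # No mapping for this extension: the file exists but is not served.
        return (404, None)
    return (200, mime)


before = {".md": "text/markdown"}
after = {".md": "text/markdown", ".": "application/octet-stream"}  # assumed type
```

With `before`, the extensionless resource file is refused while the `.md` metadata file is served, which matches notes syncing fine while attachments fail. With `after`, both are served.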

Thank you for bearing with me and responding. Sometimes working through the process with a little help is all it needs!

If I may make a suggestion - Or a couple!

The first one - custom or missing file extensions. I already ran into this issue with the md files. If you use non-standard file extensions, some web servers will not serve the files, as they do not have MIME type mappings for them. I appreciate this is likely too late to change, but I am sure it is a major hassle for people trying to figure out WebDAV sync!

The other one - Joplin is not handling HTTP response codes when fetching elements, it seems. The server was clearly giving out 404 responses. Joplin was considering that a valid document. This is, I think, an obvious error. Some sort of check here would probably have saved a lot of time and pointed to the issue a lot sooner.
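To make the second suggestion concrete, here is a minimal Python sketch of the kind of check meant: treat any non-2xx status as a failed download instead of saving the response body as the attachment. It is not Joplin's code, and the thread does not confirm how Joplin actually handles this case; `DownloadError` and `body_or_raise` are hypothetical names.

```python
class DownloadError(Exception):
    """Raised when the server did not return the requested file."""


def body_or_raise(status: int, body: bytes, url: str) -> bytes:
    """Return the response body only for 2xx statuses, otherwise raise.

    Hypothetical helper for illustration; a real client would call this
    on the result of its HTTP GET before writing anything to disk.
    """
    if not 200 <= status < 300:
        raise DownloadError(f"GET {url} failed with HTTP {status}")
    return body
```

With a check like this, the 404 page would surface as a sync error naming the URL, rather than ending up on disk as a "corrupt" PDF.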

Thank you all!

wh201906 commented 1 year ago

I'm glad to hear your problem has been solved. As for the suggestions, maybe you could post them in the forum, where more people will see them. https://discourse.joplinapp.org/

laurent22 commented 1 year ago

The log is incomplete and doesn't show the 404 error. Could you share the log that includes this error please?

For information, IIS is an awful implementation and we already have some hacks to support it. Maybe they broke it further in a recent version and we need more hacks, but we'll need the log for this. See here for example:

https://github.com/laurent22/joplin/blob/487112fd4d77500a3ecf699667ef038000968825/packages/lib/file-api-driver-webdav.js#L168

github-actions[bot] commented 1 year ago

Hey there, it looks like there has been no activity on this issue recently. Has the issue been fixed, or does it still require the community's attention? If you require support or are requesting an enhancement or feature then please create a topic on the Joplin forum. This issue may be closed if no further activity occurs. You may comment on the issue and I will leave it open. Thank you for your contributions.

github-actions[bot] commented 1 year ago

Closing this issue after a prolonged period of inactivity. If this issue is still present in the latest release, feel free to create a new issue with up-to-date information.