jstaf / onedriver

A native Linux filesystem for Microsoft OneDrive
GNU General Public License v3.0

Corruptions when copying/accessing file #222

Closed 404dcd closed 1 year ago

404dcd commented 2 years ago

This pretty much says it all.

➜ ~/OneDrive $ md5sum Render.mp4
3bb868fed4d95bb2b4048a66f3f32f13  Render.mp4
➜ ~/OneDrive $ md5sum Render.mp4
012ccee7467f84b577ac44c312ff5a52  Render.mp4
➜ ~/OneDrive $ md5sum Render.mp4
c12b17e7c656a4269f289ea4b4e4c661  Render.mp4

The file size also differs each time: the original is about 27 MB, but anywhere from 10 to 20 MB of it gets copied. Here's the relevant portion of the log: logs.txt. From a little prodding, it looks like what it copies is correct; it just doesn't copy all of it. I'm using onedriver v0.12.0.
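One way to test the "correct but incomplete" observation is to compare a copy from the mount against a known-good original byte for byte. The following is a minimal sketch, not part of onedriver; the function name `compare_prefix` and both paths are hypothetical, and it assumes a trusted copy of the file is available outside the mount.

```python
def compare_prefix(reference_path, candidate_path, chunk_size=1 << 20):
    """Check whether the candidate is a byte-for-byte prefix of the reference.

    Returns (is_prefix, candidate_bytes_read, reference_size).
    On a mismatch, candidate_bytes_read counts up to and including
    the chunk that differed.
    """
    with open(reference_path, "rb") as ref:
        ref.seek(0, 2)            # jump to the end to learn the reference size
        reference_size = ref.tell()
        ref.seek(0)
        candidate_bytes_read = 0
        with open(candidate_path, "rb") as cand:
            while True:
                cand_chunk = cand.read(chunk_size)
                if not cand_chunk:
                    # Candidate ended without any differing byte.
                    return True, candidate_bytes_read, reference_size
                candidate_bytes_read += len(cand_chunk)
                ref_chunk = ref.read(len(cand_chunk))
                if ref_chunk != cand_chunk:
                    return False, candidate_bytes_read, reference_size
```

A result such as `(True, n, m)` with `n < m` would match the symptom described here: the data that arrives is right, but the file is cut short.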

Bahnschrift commented 2 years ago

Same issue here with similar logs. Possibly the same issue as #200?

benkinooby commented 2 years ago

Hi,

First off, thank you for creating and sharing your work.

I am hit by the same issue and have matching log entries as well.

I am using a 10 MB Microsoft PowerPoint presentation as an example, calculating its md5sum three times with a different result each time.

user@computer ~/oneDriverMountDir $ md5sum Presentation.pptx 
457840920c16b31b142f82d13e2565fb  Presentation.pptx
user@computer ~/oneDriverMountDir $ md5sum Presentation.pptx 
19da4f0bb413e8503fc37b4da01e2232  Presentation.pptx
user@computer ~/oneDriverMountDir $ md5sum Presentation.pptx 
c48cc9971c1fbe7b2975285dae52b86e  Presentation.pptx

I set logging level to trace.

Here is the output (I removed some parts of the log and marked those gaps with [...]):

14:58:33 INF Fetching remote content for item from API. id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 op=Open path=/Presentation.pptx                                                                                                                             
14:58:35 TRC Fetching deltas from server.                                                                                                                                                                                                                               
14:58:36 INF Fetched 0 deltas.                                                                                                                                                                                                                                          
14:58:36 DBG Serializing cache metadata to disk.                                                                                                                                                                                                                        
14:58:48 TRC  id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 op=GetAttr path=/Presentation.pptx                                                                                                                                                                    
14:58:48 TRC  bufsize=65536 fileSize=10104832 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=0 op=Read originalBufsize=65536 path=/Presentation.pptx                                                                                                        
14:58:48 TRC  bufsize=131072 fileSize=10104832 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=65536 op=Read originalBufsize=131072 path=/Presentation.pptx                                                                                                  
14:58:48 TRC  bufsize=131072 fileSize=10104832 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=327680 op=Read originalBufsize=131072 path=/Presentation.pptx                                                                                                 
14:58:48 TRC  bufsize=131072 fileSize=10104832 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=196608 op=Read originalBufsize=131072 path=/Presentation.pptx                                                                                                 
14:58:48 TRC  bufsize=131072 fileSize=10104832 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=458752 op=Read originalBufsize=131072 path=/Presentation.pptx                                                                                                 
14:58:48 TRC  bufsize=131072 fileSize=10104832 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=589824 op=Read originalBufsize=131072 path=/Presentation.pptx                                                                                                 
14:58:48 TRC  bufsize=131072 fileSize=10104832 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=720896 op=Read originalBufsize=131072 path=/Presentation.pptx
[...]
14:58:48 TRC  bufsize=131072 fileSize=10104832 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=9502720 op=Read originalBufsize=131072 pat
14:58:48 TRC  bufsize=131072 fileSize=10104832 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=9633792 op=Read originalBufsize=131072 pat
14:58:48 TRC  bufsize=131072 fileSize=10104832 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=9764864 op=Read originalBufsize=131072 pat
14:58:48 TRC  bufsize=131072 fileSize=10104832 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=9895936 op=Read originalBufsize=131072 pat
14:58:48 TRC  bufsize=77824 fileSize=10104832 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=10027008 op=Read originalBufsize=77824 path
14:58:48 TRC  bufsize=0 fileSize=10104832 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=10104832 op=Read originalBufsize=4096 path=/Fol
14:58:48 DBG  id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 op=Flush path=/Presentation.pptx
14:58:48 DBG  id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 op=Fsync path=/Presentation.pptx
14:59:06 TRC Fetching deltas from server.
14:59:06 INF Fetched 0 deltas.
14:59:06 DBG Serializing cache metadata to disk.
14:59:23 TRC  id=015HHZSYN6Y2GOVW7725BZO354PWSELRRZ name=Presentation.pptx nodeID=1 op=Lookup
14:59:23 DBG  id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 op=Open path=/Presentation.pptx
14:59:23 INF Not using cached item due to file hash mismatch. drivetype=business id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 op=Open path=/Presentation.pptx
14:59:23 INF Fetching remote content for item from API. id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 op=Open path=/Presentation.pptx
14:59:36 TRC Fetching deltas from server.
14:59:36 INF Fetched 0 deltas.
14:59:36 DBG Serializing cache metadata to disk.
14:59:38 TRC  id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 op=GetAttr path=/Presentation.pptx
14:59:38 TRC  bufsize=65536 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=0 op=Read originalBufsize=65536 path=/Presentation.pptx
14:59:38 TRC  bufsize=131072 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=196608 op=Read originalBufsize=131072 path=/Presentation.pptx
14:59:38 TRC  bufsize=131072 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=65536 op=Read originalBufsize=131072 path=/Presentation.pptx
14:59:38 TRC  bufsize=131072 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=327680 op=Read originalBufsize=131072 path=/Presentation.pptx
14:59:38 TRC  bufsize=131072 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=458752 op=Read originalBufsize=131072 path=/Presentation.pptx
14:59:38 TRC  bufsize=131072 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=589824 op=Read originalBufsize=131072 path=/Presentation.pptx
14:59:38 TRC  bufsize=131072 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=851968 op=Read originalBufsize=131072 path=/Presentation.pptx
14:59:38 TRC  bufsize=131072 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=720896 op=Read originalBufsize=131072 path=/Presentation.pptx
14:59:38 TRC  bufsize=131072 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=983040 op=Read originalBufsize=131072 path=/Presentation.pptx
14:59:38 TRC  bufsize=131072 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=1114112 op=Read originalBufsize=131072 path=/Presentation.pptx
14:59:38 TRC  bufsize=131072 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=1245184 op=Read originalBufsize=131072 path=/Presentation.pptx
14:59:38 TRC  bufsize=131072 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=1376256 op=Read originalBufsize=131072 path=/Presentation.pptx
14:59:38 TRC  bufsize=131072 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=1507328 op=Read originalBufsize=131072 path=/Presentation.pptx
[...]
14:59:38 TRC  bufsize=131072 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=8454144 op=Read originalBufsize=131072 path=/Presentation.pptx
14:59:38 TRC  bufsize=131072 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=8585216 op=Read originalBufsize=131072 path=/Presentation.pptx
14:59:38 TRC  bufsize=131072 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=8716288 op=Read originalBufsize=131072 path=/Presentation.pptx
14:59:38 TRC  bufsize=131072 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=8847360 op=Read originalBufsize=131072 path=/Presentation.pptx
14:59:38 TRC  bufsize=131072 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=8978432 op=Read originalBufsize=131072 path=/Presentation.pptx
14:59:38 TRC  bufsize=131072 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=9240576 op=Read originalBufsize=131072 path=/Presentation.pptx
14:59:38 TRC  bufsize=131072 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=9109504 op=Read originalBufsize=131072 path=/Presentation.pptx
14:59:38 TRC  bufsize=131072 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=9371648 op=Read originalBufsize=131072 path=/Presentation.pptx
14:59:38 TRC  bufsize=131072 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=9502720 op=Read originalBufsize=131072 path=/Presentation.pptx
14:59:38 TRC  bufsize=131072 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=9633792 op=Read originalBufsize=131072 path=/Presentation.pptx
14:59:38 TRC  bufsize=131072 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=9764864 op=Read originalBufsize=131072 path=/Presentation.pptx
14:59:38 TRC  bufsize=126976 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=9895936 op=Read originalBufsize=126976 path=/Presentation.pptx
14:59:38 TRC  bufsize=0 fileSize=10022912 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=10022912 op=Read originalBufsize=4096 path=/Presentation.pptx
14:59:38 DBG  id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 op=Flush path=/Presentation.pptx
14:59:38 DBG  id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 op=Fsync path=/Presentation.pptx
15:00:06 TRC Fetching deltas from server.
15:00:06 INF Fetched 0 deltas.
15:00:06 DBG Serializing cache metadata to disk.
15:00:21 TRC  id=015HHZSYN6Y2GOVW7725BZO354PWSELRRZ name=Presentation.pptx nodeID=1 op=Lookup
15:00:21 DBG  id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 op=Open path=/Presentation.pptx
15:00:21 INF Not using cached item due to file hash mismatch. drivetype=business id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 op=Open path=/Presentation.pptx
15:00:21 INF Fetching remote content for item from API. id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 op=Open path=/Presentation.pptx
15:00:36 TRC  id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 op=GetAttr path=/Presentation.pptx
15:00:36 TRC  bufsize=65536 fileSize=10285056 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=0 op=Read originalBufsize=65536 path=/Presentation.pptx
15:00:36 TRC  bufsize=131072 fileSize=10285056 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=65536 op=Read originalBufsize=131072 path=/Presentation.pptx
15:00:36 TRC  bufsize=131072 fileSize=10285056 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=196608 op=Read originalBufsize=131072 path=/Presentation.pptx
15:00:36 TRC  bufsize=131072 fileSize=10285056 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=327680 op=Read originalBufsize=131072 path=/Presentation.pptx
15:00:36 TRC  bufsize=131072 fileSize=10285056 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=458752 op=Read originalBufsize=131072 path=/Presentation.pptx
15:00:36 TRC  bufsize=131072 fileSize=10285056 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=589824 op=Read originalBufsize=131072 path=/Presentation.pptx
15:00:36 TRC  bufsize=131072 fileSize=10285056 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=851968 op=Read originalBufsize=131072 path=/Presentation.pptx
15:00:36 TRC  bufsize=131072 fileSize=10285056 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=720896 op=Read originalBufsize=131072 path=/Presentation.pptx
15:00:36 TRC  bufsize=131072 fileSize=10285056 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=1114112 op=Read originalBufsize=131072 path=/Presentation.pptx
15:00:36 TRC  bufsize=131072 fileSize=10285056 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=983040 op=Read originalBufsize=131072 path=/Presentation.pptx
15:00:36 TRC  bufsize=131072 fileSize=10285056 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=1245184 op=Read originalBufsize=131072 path=/Presentation.pptx
15:00:36 TRC  bufsize=131072 fileSize=10285056 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=1376256 op=Read originalBufsize=131072 path=/Presentation.pptx
15:00:36 TRC  bufsize=131072 fileSize=10285056 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=1507328 op=Read originalBufsize=131072 path=/Presentation.pptx
15:00:36 TRC  bufsize=131072 fileSize=10285056 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=1638400 op=Read originalBufsize=131072 path=/Presentation.pptx
15:00:36 TRC  bufsize=131072 fileSize=10285056 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=1900544 op=Read originalBufsize=131072 path=/Presentation.pptx
15:00:36 TRC  bufsize=131072 fileSize=10285056 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=1769472 op=Read originalBufsize=131072 path=/Presentation.pptx
[...]
15:00:36 TRC  bufsize=131072 fileSize=10285056 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=9633792 op=Read originalBufsize=131072 path=/Presentation.pptx
15:00:36 TRC  bufsize=131072 fileSize=10285056 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=9764864 op=Read originalBufsize=131072 path=/Presentation.pptx
15:00:36 TRC  bufsize=131072 fileSize=10285056 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=10027008 op=Read originalBufsize=131072 path=/Presentation.pptx
15:00:36 TRC  bufsize=131072 fileSize=10285056 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=9895936 op=Read originalBufsize=131072 path=/Presentation.pptx
15:00:36 TRC  bufsize=126976 fileSize=10285056 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=10158080 op=Read originalBufsize=126976 path=/Presentation.pptx
15:00:36 TRC  bufsize=0 fileSize=10285056 id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 offset=10285056 op=Read originalBufsize=4096 path=/Presentation.pptx
15:00:36 DBG  id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 op=Flush path=/Presentation.pptx
15:00:36 DBG  id=015HHZSYKI5RVLIHBWWZFLYJ3XXP3B6LAU nodeID=34 op=Fsync path=/Presentation.pptx
15:00:36 TRC Fetching deltas from server.
15:00:36 INF Fetched 0 deltas.
15:00:36 DBG Serializing cache metadata to disk.

Note how fileSize changes on each run and how the log states "Not using cached item due to file hash mismatch." before every re-download.
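For longer trace logs, the changing fileSize can be pulled out mechanically rather than read by eye. This is a rough sketch that assumes the `key=value` layout shown in the log above (`op=Read`, `fileSize=`, `id=`); the function name `observed_file_sizes` is made up for illustration. It keys on `id` rather than `path` because some of the pasted lines are cut off before the path.

```python
import re

SIZE_RE = re.compile(r"\bfileSize=(\d+)")
ID_RE = re.compile(r"\bid=(\S+)")  # case-sensitive, so it does not match nodeID=

def observed_file_sizes(log_lines):
    """Map each item id to the sequence of fileSize values seen in Read ops.

    Consecutive repeats are collapsed, so each entry roughly corresponds
    to one pass over the file.
    """
    sizes = {}
    for line in log_lines:
        if "op=Read" not in line:
            continue
        size_match = SIZE_RE.search(line)
        id_match = ID_RE.search(line)
        if not (size_match and id_match):
            continue
        seen = sizes.setdefault(id_match.group(1), [])
        size = int(size_match.group(1))
        if not seen or seen[-1] != size:
            seen.append(size)
    return sizes
```

Run against the excerpt above, this should report three different sizes for the same item id, one per md5sum invocation.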

abraunegg commented 2 years ago

@benkinooby Is your account a Business Account or is the data shared on SharePoint?

If so, beware of the modifications Microsoft makes when files are uploaded through those backends: Microsoft adds metadata, which changes the file and therefore its hash. Please read: https://github.com/OneDrive/onedrive-api-docs/issues/935

benkinooby commented 2 years ago

@abraunegg It's a OneDrive for Business account. Thank you very much for the hint.

Regarding https://github.com/OneDrive/onedrive-api-docs/issues/935, for my understanding: even if the file is modified by added metadata, shouldn't the md5sum stay consistent once the file has been "enriched"? I get a different md5sum on every try.

Also, the md5sum was just a troubleshooting step after getting incomplete downloads for files bigger than about 9 MB (the same issue as the original reporter).
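The consistency argument can be made concrete: a one-time server-side modification would give a digest that differs from the uploaded file but stays the same across reads, whereas truncated downloads give a new size and digest each time. Below is a small sketch of that check; the helper names `read_fingerprint` and `is_stable` are hypothetical and not part of onedriver.

```python
import hashlib

def read_fingerprint(path, chunk_size=1 << 20):
    """Read the whole file once and return (bytes_read, md5_hex)."""
    digest = hashlib.md5()
    total = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
            total += len(chunk)
    return total, digest.hexdigest()

def is_stable(path, runs=3):
    """Re-open and re-read the file several times.

    Returns (stable, fingerprints), where stable is True only if every
    run produced the same size and digest.
    """
    fingerprints = {read_fingerprint(path) for _ in range(runs)}
    return len(fingerprints) == 1, sorted(fingerprints)
```

If `is_stable` returns `False` and the fingerprints show differing sizes, metadata enrichment alone does not explain the behaviour, which is the point made above.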

404dcd commented 1 year ago

Fixed in v0.13.0.

jstaf commented 1 year ago

Thanks for following up!