RocketChat / Rocket.Chat

The communications platform that puts data protection first.
https://rocket.chat/

Pictures corrupting when using remote network drive w. caching in fstab #19974

Open Angstroem opened 3 years ago

Angstroem commented 3 years ago

Description:

When using the Rocket.Chat mobile app on Android or the PC app to send one or more pictures to the Rocket.Chat instance, some of the uploaded pictures are corrupted (e.g. one half of the picture is black/transparent). We observed this behaviour on multiple Android phones (including three Samsung Galaxy S10+, a Galaxy S20, an older A-series phone, etc.) as well as on PCs. It affects roughly every fifth picture, especially when sending multiple pictures at once (but single pictures were also affected from time to time).

Steps to reproduce:

  1. Select one or multiple pictures from the app or from the PC and send them to the server

Expected behavior:

All pictures should be uploaded correctly and with full integrity.

Actual behavior:

Some pictures are corrupted, as seen in this screenshot: [screenshot]

When clicking on the image in the PC app or in the browser, the following can be observed: [screenshot]

The Android app shows the same behaviour. Furthermore, when downloading the file itself to the PC and opening it in a picture viewer, this mess is shown: [screenshot]

This behaviour could also be observed when sharing pictures directly from the PC app/browser. Uploads of other file types, on the other hand (for example a .zip archive), seem to work just fine.

Server Setup Information:

Client Setup Information

Android app:

PC app:

Additional context

The instance stores all uploads (and custom emojis, etc.) with the Storage Type FileSystem. The corresponding System Path points to a folder that is mapped to a network drive via the static file system table (fstab). The user rocketchat has rwx permissions on this folder, as seen in this picture: [screenshot]

This issue shows the same behaviour as we are getting.

Relevant logs:

Logs from the browser console: [screenshot]

Server logs (after updating to the newest version today):


I20201227-17:21:03.590(1) (migrations.js:120) Migrations: Migrating from version 209 -> 212
I20201227-17:21:03.597(1) (migrations.js:120) Migrations: Running up() on version 210
I20201227-17:21:03.614(1) (migrations.js:120) Migrations: Running up() on version 211
I20201227-17:21:05.022(1) (migrations.js:120) Migrations: Running up() on version 212
I20201227-17:21:05.042(1) (migrations.js:120) Migrations: Finished migrating.
I20201227-17:21:15.046(1) Not migrating, control is locked. Attempt 1/30. Trying again in 10 seconds. 
I20201227-17:21:15.054(1) (migrations.js:120) Migrations: Not migrating, already at version 212
I20201227-17:21:20.478(1) Loaded the Apps Framework and loaded a total of 1 Apps! 
I20201227-17:21:20.685(1) Using FileSystem for custom sounds storage 
I20201227-17:21:20.696(1) Using FileSystem for custom emoji storage 
I20201227-17:21:20.766(1) Updating process.env.MAIL_URL 
I20201227-17:21:23.155(1) ➔ System ➔ startup 
I20201227-17:21:23.156(1) ➔ +--------------------------------------+ 
I20201227-17:21:23.158(1) ➔ |            SERVER RUNNING            | 
I20201227-17:21:23.159(1) ➔ +--------------------------------------+ 
I20201227-17:21:23.160(1) ➔ |                                      | 
I20201227-17:21:23.161(1) ➔ |  Rocket.Chat Version: 3.9.3          | 
I20201227-17:21:23.162(1) ➔ |       NodeJS Version: 12.20.0 - x64  | 
I20201227-17:21:23.163(1) ➔ |      MongoDB Version: 4.0.21         | 
I20201227-17:21:23.164(1) ➔ |       MongoDB Engine: mmapv1         | 
I20201227-17:21:23.165(1) ➔ |             Platform: linux          | 
I20201227-17:21:23.167(1) ➔ |         Process Port: 3000           | 
I20201227-17:21:23.168(1) ➔ |             Site URL:                | 
I20201227-17:21:23.168(1) ➔ |     ReplicaSet OpLog: Enabled        | 
I20201227-17:21:23.169(1) ➔ |          Commit Hash: e47dc05618     | 
I20201227-17:21:23.170(1) ➔ |        Commit Branch: HEAD           | 
I20201227-17:21:23.172(1) ➔ |                                      | 
I20201227-17:21:23.172(1) ➔ +--------------------------------------+

gabriellsh commented 3 years ago

Hi @Angstroem! If you access the folder where the uploads are saved, does the image appear normal? Also, if you upload the same image again, does it corrupt again?

The logs you sent from the browser seem to be browser extension errors (maybe some extension you have installed does not work with Rocket.Chat). It would be really helpful to see whether the server logs anything AFTER a corrupted image is uploaded.

I tried the FileSystem storage on the latest version (using a local folder) and it seems to work fine (I uploaded a couple dozen pictures). Since you're using remote storage for the uploads, it looks like it could be a connection problem between the server OS and this remote storage.

Can you provide the additional info so we can have a better understanding of what is going on? Thanks!

Angstroem commented 3 years ago

Hello @gabriellsh, thanks for your fast reply!

First, I don't think that a network connection issue or bottleneck is the cause of this problem. Rocket.Chat is deployed in its own VM, which is hosted together with a FreeNAS VM on the same server. The host server runs Windows Server 2016 with 2x 4c/8t Xeon processors and 128 GB of RAM (of which 6 GB are allocated to the Rocket.Chat VM and 16 GB to the FreeNAS VM hosting the network shares). The Rocket.Chat VM also has its own dedicated user as well as its own dataset in FreeNAS.

Based on your suggestion, I suspect the problem comes from the network share configuration itself. When I manually locate a corrupted picture on the network share and download the file to my local machine, it is no longer corrupted when opened in a picture viewer. But downloading the same image through the Rocket.Chat client yields a corrupted image.

This screenshot, along with the following procedure, shows what I observed: [Screenshot 2020-12-29 2050571]

Procedure used:

  1. Download the corrupted picture through Rocket.Chat client and open it (the pink one above)
  2. Right-Click on the message -> Copy link address.
  3. Extract file name from link address. E.g. https://<instance_name>/file-upload/Qi7mG86an*********/20201227_131035.jpg -> Qi7mG86an*********
  4. Download the file Qi7mG86an********* from the network share (in the folder /file-uploads) and append .jpg to its name
  5. Open the file in a picture viewer (the blue one above)
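The comparison in the procedure above can also be done byte for byte instead of by eye. The following is a minimal Python sketch, not part of the original report: the helper names `sha256_of` and `first_difference` are hypothetical, and the two paths passed in are assumed to be the copy fetched through the Rocket.Chat client and the copy taken directly from the network share.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def first_difference(a: Path, b: Path, chunk_size: int = 1 << 20):
    """Return the byte offset of the first difference, or None if identical."""
    offset = 0
    with a.open("rb") as fa, b.open("rb") as fb:
        while True:
            chunk_a = fa.read(chunk_size)
            chunk_b = fb.read(chunk_size)
            if chunk_a != chunk_b:
                for i, (x, y) in enumerate(zip(chunk_a, chunk_b)):
                    if x != y:
                        return offset + i
                # One chunk is a prefix of the other: the files differ in length.
                return offset + min(len(chunk_a), len(chunk_b))
            if not chunk_a:
                return None  # both files ended without any difference
            offset += len(chunk_a)


# Example usage (both file names are placeholders):
# via_client = Path("downloaded_via_client.jpg")
# from_share = Path("copied_from_share.jpg")
# print(sha256_of(via_client) == sha256_of(from_share))
# print(first_difference(via_client, from_share))
```

If the digests differ, the offset of the first mismatch may hint at whether whole blocks of the file were affected, though this sketch makes no claim about where the corruption originates.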

However, I may have found a solution! I explicitly disabled file system caching in the fstab of the Rocket.Chat VM. Here is the part of my fstab configuration responsible for the network share that Rocket.Chat uses (uid 1001 is the user rocketchat):

# /etc/fstab 
# Rocket.Chat appdata network share
//<address_of_freenas_share>/rocket-chat /mnt/share cifs uid=1001,credentials=/<path_to_credential_file>/.smb_pwfile,cache=none 

Since then I have not been able to reproduce the problem, and I am still on version 3.9.3. I tested over 50 different photos and not a single one was corrupted. I will leave file system caching off as it fixes my problem for now, but it would be nice if someone with the same configuration tried to reproduce what I discovered. There may be a bug somewhere, as caching should not affect data integrity. But correct me if I am wrong.

Btw., there were no server log entries, even before the update to 3.9.3, indicating that anything went wrong after a corrupted picture was uploaded to Rocket.Chat.
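For anyone trying to reproduce this, it may help to confirm which cache mode the mounted share is actually using, since an fstab change only takes effect after a remount. The sketch below is an illustration only: it assumes a Linux client where the kernel lists a `cache=` entry among the CIFS mount options in /proc/mounts, and the function name `cifs_cache_mode` is hypothetical.

```python
from typing import Optional


def cifs_cache_mode(mounts_text: str, mount_point: str) -> Optional[str]:
    """Return the cache= value of a CIFS mount, or None if the mount is not found.

    `mounts_text` is expected to have the layout of /proc/mounts:
    device, mount point, file system type, options, dump, pass.
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[1] != mount_point or fields[2] != "cifs":
            continue
        for option in fields[3].split(","):
            if option.startswith("cache="):
                return option.split("=", 1)[1]
        # The mount exists but no cache= option was listed.
        return "unspecified"
    return None


# Example usage on the Rocket.Chat VM (mount point taken from the fstab above):
# from pathlib import Path
# print(cifs_cache_mode(Path("/proc/mounts").read_text(), "/mnt/share"))
```

A result of `none` would indicate that the setting from the fstab entry is in effect; any other value suggests the share was mounted with different options.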

Angstroem commented 3 years ago

FYI: When I re-enable caching (by removing cache=none from fstab), I can instantly reproduce the problem in my environment.
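To check whether the share itself misbehaves independently of Rocket.Chat, a write and read-back test against the mount point could be run with and without cache=none. This is a rough sketch under assumptions: the helper names `roundtrip_ok` and `count_failures` are hypothetical, the test file name is arbitrary, and a single sequential write may not match the way Rocket.Chat writes uploads, so a clean result does not rule out the problem.

```python
import hashlib
import os
from pathlib import Path


def roundtrip_ok(directory: Path, size: int = 4 * 1024 * 1024) -> bool:
    """Write random bytes into `directory`, read them back, and compare digests."""
    payload = os.urandom(size)
    expected = hashlib.sha256(payload).hexdigest()
    target = directory / "roundtrip-test.bin"  # arbitrary test file name
    try:
        with target.open("wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # ask the OS to push the data to the share
        actual = hashlib.sha256(target.read_bytes()).hexdigest()
    finally:
        target.unlink(missing_ok=True)  # requires Python 3.8+
    return actual == expected


def count_failures(directory: Path, attempts: int = 50, size: int = 4 * 1024 * 1024) -> int:
    """Run the round trip several times and count how often the data differed."""
    return sum(0 if roundtrip_ok(directory, size) else 1 for _ in range(attempts))


# Example usage (mount point taken from the fstab entry in this thread):
# print(count_failures(Path("/mnt/share")))
```

Running it once with caching enabled and once with cache=none would show whether plain file I/O on the mount is already affected, or whether the corruption only appears through the application.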

gabriellsh commented 3 years ago

Thanks a lot for your detailed explanation @Angstroem! Could you rename the issue so that it more accurately reflects the caching problem? I'll leave it open, since there may still be a bug.

Once again, thanks for your contribution!

Angstroem commented 3 years ago

Okay, done! Thanks for your support as well.

Here is some further information on the Linux VM that is running Rocket.Chat:

Linux Debian kernel version:

Linux RocketChat 4.19.0-13-amd64 #1 SMP Debian 4.19.160-2 (2020-11-28) x86_64

cifs-utils version:

Package: cifs-utils
Version: 2:6.8-2
Priority: optional
Section: otherosfs
Maintainer: Debian Samba Maintainers <pkg-samba-maint@lists.alioth.debian.org>
Installed-Size: 237 kB

Let me know if you need any more information from my instance.