Forceu / Gokapi

Lightweight selfhosted Firefox Send alternative without public upload. AWS S3 supported.
GNU Affero General Public License v3.0
1.72k stars 69 forks

Bug: No indication of upload to remote storage #193

Closed Forceu closed 2 months ago

Forceu commented 4 months ago

Discussed in https://github.com/Forceu/Gokapi/discussions/192

Originally posted by **MalteMagnussen** July 22, 2024

Trying to upload a large file (15GB in this case) to Gokapi causes it to "hang" at the end of the upload.

![image](https://github.com/user-attachments/assets/33c4cfcf-0105-4e4b-97f9-3c6db7a713d1)

It has been several minutes. We are using S3 as the storage. We suspect it might be because S3 has to write to "disk/tape" first, and then it sends a link back, and Gokapi is waiting to receive that link, so it can create the row with the links etc. Is this a correct assumption?

Looking at the logs doesn't provide any insights (ASCII-art startup banner omitted):

```
Gokapi v1.9.0 starting
AWS login successful
Saving new files to cloud storage
Binding webserver to :53842
Webserver can be accessed at https://redacted/admin
Press CTRL+C to stop Gokapi
```

For good measure, here are the `config.json` bits about the chunksize, etc:

```
"MaxMemory": 50,
"UseSsl": false,
"MaxFileSizeMB": 102400,
"ChunkSize": 45,
"MaxParallelUploads": 4,
```

And the container resources:

```yaml
resources:
  limits:
    cpu: 750m
    memory: 1024Mi
  requests:
    cpu: 375m
    memory: 512Mi
```

Smaller sizes working:

![image](https://github.com/user-attachments/assets/1d1e113e-e624-46ac-be24-fe99dc2e9454)
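To get a feel for what the quoted settings mean for a 15GB upload, here is a rough back-of-the-envelope sketch. It assumes `ChunkSize` is given in MB and that "15GB" means 15 × 1024 MB; the helper names `chunk_count` and `upload_rounds` are made up for illustration and are not part of Gokapi.

```python
import math


def chunk_count(file_size_mb: int, chunk_size_mb: int) -> int:
    """Number of chunks needed to cover the file; the last chunk may be partial."""
    if chunk_size_mb <= 0:
        raise ValueError("chunk size must be positive")
    return math.ceil(file_size_mb / chunk_size_mb)


def upload_rounds(chunks: int, parallel: int) -> int:
    """Rounds needed if `parallel` chunks are always in flight together (idealised)."""
    if parallel <= 0:
        raise ValueError("parallel must be positive")
    return math.ceil(chunks / parallel)


# Values taken from the config.json excerpt above.
CHUNK_SIZE_MB = 45
MAX_PARALLEL_UPLOADS = 4

# Assumption: the 15GB test file is 15 * 1024 MB.
file_size_mb = 15 * 1024

chunks = chunk_count(file_size_mb, CHUNK_SIZE_MB)
rounds = upload_rounds(chunks, MAX_PARALLEL_UPLOADS)
print(f"{chunks} chunks, about {rounds} rounds of {MAX_PARALLEL_UPLOADS} parallel uploads")
```

Under these assumptions the browser sends a few hundred chunks, and the server still has to assemble them and push the result to S3 afterwards, which is the phase where the upload appeared to hang.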
Forceu commented 4 months ago

@MalteMagnussen Thanks for the report, I was able to reproduce it. Normally there should be an indication that the file is being processed / uploaded, I will look into it.

Forceu commented 4 months ago

Thanks, there was a problem with the implemented SSE protocol, which was used since the new release. Fixed in 4f71a2b
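For readers unfamiliar with SSE (server-sent events): the server keeps one HTTP response open and pushes small text frames, which the page uses to update its status display. The thread does not say what exactly was wrong in Gokapi's implementation, so the following is only a generic sketch of the wire format; the event name `status` and the payloads are hypothetical and not Gokapi's real messages.

```python
from typing import Dict, List, Optional


def format_sse(data: str, event: Optional[str] = None) -> str:
    """Serialise one SSE frame: optional 'event:' line, one 'data:' line per
    line of payload, terminated by a blank line."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    return "\n".join(lines) + "\n\n"


def parse_sse(stream: str) -> List[Dict[str, str]]:
    """Minimal parser for a complete stream; handles only 'event' and 'data' fields."""
    events = []
    for block in stream.split("\n\n"):
        if not block.strip():
            continue
        name = "message"  # default event type when no 'event:' line is present
        data_lines = []
        for line in block.split("\n"):
            if line.startswith(":"):
                continue  # comment line, often used as a keep-alive
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is stripped
            if field == "event":
                name = value
            elif field == "data":
                data_lines.append(value)
        events.append({"event": name, "data": "\n".join(data_lines)})
    return events


# Hypothetical status updates for one upload.
stream = format_sse("processing", event="status") + format_sse("uploading", event="status")
for ev in parse_sse(stream):
    print(ev["event"], ev["data"])
```

If a frame is malformed or the terminating blank line is missing, the client never fires the corresponding event, which would be consistent with a UI that shows no progress indication.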

MalteMagnussen commented 4 months ago

@Forceu - Thank you very much. It is much appreciated.

The 15GB.file is showing up in the list of files upon visit to our gokapi instance today.

Would increasing any of the config give improved performance?
Could the CPU/MEM be a bottleneck, that if raised to a few GB each, would speed up the processing or upload?

Uploading a 20GB file, I am seeing "Processing..." now, and after a little over a minute "Uploading file..."

This is version 1.9.1

image

~40 minutes later, I chose to open the site in a new tab.

Holding the browser windows side-by-side, I can see that the 20GB file was successfully uploaded, but the tab I uploaded the file from is still saying "Uploading file..."

image

I create large files like so: `dd if=/dev/zero of=20GB.file bs=1G count=20`

Update: Trying locally with the same image in docker, unrestricted by CPU or MEM on a quite powerful ThinkPad, I can get it to finish the upload after a while. I'll try increasing the resources of the Kubernetes deployed container some more.

image

Do you think it is a CPU/MEM issue, or a local SSD vs remote S3 issue?

Update: I significantly increased the CPU and MEM to several GB in k8s, and am trying 20GB again.

Still seeing the "Uploading...", yet I can see the file in the list of files, if I open it in a new tab.

image

Forceu commented 3 months ago

Sorry, I just saw your last comment. Hm, this should be limited by neither CPU nor RAM, as by default all files larger than 50MB are written to a temporary file instead of being stored in RAM.

How long does the upload take? I can imagine that there is a timeout going on. Unfortunately the JS is not covered by the unit tests, which is why most bugs take place there.
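The "spill to a temporary file above a memory limit" behaviour described here can be illustrated with a small sketch. This is not Gokapi's code (Gokapi is written in Go); `SpillingBuffer` is a hypothetical class, and the threshold corresponds to the `MaxMemory: 50` setting only by assumption.

```python
import io
import tempfile


class SpillingBuffer:
    """Keeps written data in RAM until a limit is exceeded, then moves it to a temp file."""

    def __init__(self, max_memory_bytes: int) -> None:
        self._max = max_memory_bytes
        self._buf = io.BytesIO()
        self.on_disk = False

    def write(self, chunk: bytes) -> None:
        # Switch to disk before the in-memory buffer would exceed the limit.
        if not self.on_disk and self._buf.tell() + len(chunk) > self._max:
            spill = tempfile.TemporaryFile()
            spill.write(self._buf.getvalue())
            self._buf = spill
            self.on_disk = True
        self._buf.write(chunk)

    def size(self) -> int:
        """Bytes written so far (the write position is always at the end)."""
        return self._buf.tell()

    def read_all(self) -> bytes:
        self._buf.seek(0)
        data = self._buf.read()
        self._buf.seek(0, io.SEEK_END)  # restore the append position
        return data

    def close(self) -> None:
        self._buf.close()


# Tiny limit so the effect is visible; a 50MB limit would be 50 * 1024 * 1024.
buf = SpillingBuffer(max_memory_bytes=10)
buf.write(b"12345")
print(buf.on_disk)   # still in memory
buf.write(b"678901")
print(buf.on_disk)   # limit exceeded, data now lives in a temp file
```

Python's standard library offers the same idea ready-made as `tempfile.SpooledTemporaryFile(max_size=...)`. Either way, memory use stays bounded regardless of file size, which is why raising the container's RAM is not expected to help.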

MalteMagnussen commented 3 months ago

@Forceu

Testing a 5GB file upload:

Uploading at speeds between 12MB/s and 15MB/s according to the UI.

At around 5 minutes and 50 seconds, it was at 100% upload. That makes sense, given the upload speed.

Then it starts the "Processing" and "Uploading file" step.

Refreshing the page in a second window, I see the file at 7:30 on my stop watch. So it took ~1 minute and 40 seconds to process the chunks and upload to S3 I assume? Not the end of the world imo. Would just be nice to have the "Uploading file..." step end whenever the file is finished :)
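The timings reported above can be checked with a little arithmetic. The sketch assumes the 5GB file is 5 × 1024 MB and that the server-side phase scales linearly with file size; both are assumptions, not something the thread confirms.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    return minutes * 60 + seconds


# Assumption: "5GB" means 5 * 1024 MB.
FILE_SIZE_MB = 5 * 1024

upload_done = to_seconds(5, 50)    # browser reaches 100%
file_visible = to_seconds(7, 30)   # file appears in the list in a second window

# Average browser-to-Gokapi rate; should fall in the 12-15MB/s range the UI showed.
browser_rate = FILE_SIZE_MB / upload_done

# Time and average rate of the "Processing" + "Uploading file" phase.
processing_time = file_visible - upload_done
backend_rate = FILE_SIZE_MB / processing_time


def estimate_processing_seconds(size_mb: float, rate_mb_per_s: float) -> float:
    """Linear extrapolation of the server-side phase to another file size."""
    return size_mb / rate_mb_per_s


print(f"browser upload: {browser_rate:.1f} MB/s")
print(f"processing + S3 upload: {processing_time} s at {backend_rate:.1f} MB/s")
print(f"20GB estimate: {estimate_processing_seconds(20 * 1024, backend_rate):.0f} s")
```

If the linear assumption holds, even a 20GB file would finish its server-side phase within several minutes, so a page still showing "Uploading file..." after ~40 minutes points at the status update not reaching the browser rather than at slow processing.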

image

The supporters in our company have started using the service, and the fact that it seems to "hang" on "Uploading file..." is confusing them. Even if they can just open the page in a new tab, and see the file there.

They have also mentioned that when uploading a file named "Foo", it will always show up as "downloadFile" on the user's side. Notice that the filename in the UI is correct, but Firefox is downloading it as "downloadFile". This is confusing for some customers, as they have trouble unzipping it if it doesn't keep, for example, the .tar.gz extension.

image

Something about the `Content-Disposition: attachment;` header is not being sent correctly.
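For context, a browser picks the saved filename from the `filename` / `filename*` parameters of the `Content-Disposition` response header (RFC 6266); without them it falls back to something derived from the URL. The sketch below shows how such a header value can be built. The function `content_disposition` is hypothetical, and the exact header Gokapi sends is not shown in this thread.

```python
from urllib.parse import quote


def content_disposition(filename: str) -> str:
    """Build an attachment header value with an ASCII fallback and a UTF-8 encoded name."""
    # ASCII-only fallback for older clients; non-ASCII characters become '?'.
    fallback = filename.encode("ascii", "replace").decode("ascii")
    # Escape backslashes and double quotes inside the quoted string.
    fallback = fallback.replace("\\", "\\\\").replace('"', '\\"')
    # Percent-encoded UTF-8 form (RFC 5987 syntax) for modern browsers.
    encoded = quote(filename, safe="")
    return 'attachment; filename="' + fallback + '"; filename*=UTF-8\'\'' + encoded


print(content_disposition("Foo.tar.gz"))
print(content_disposition("my report.tar.gz"))
```

When downloads are proxied, the proxying server has to set this header itself, since the client no longer sees the response headers of the storage backend directly.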

Forceu commented 3 months ago

Thanks for the feedback! I will track the issue related to the content-disposition in #199. Can you test if this also happens if you do not proxy the file download? I could imagine that this is where the problem lies.

Normally the file should be added automatically after the upload is complete. I will have a look at it tomorrow.

MalteMagnussen commented 3 months ago

@Forceu - Thank you very much :)

Let me check in our test environment what happens if I disable the proxying.

EDIT: Filename is saved here:

image

Yes, changing ProxyDownload to False will fix it. But we need the proxy, because our customers can't reach the S3 directly.

Thanks for the tip.

aws:
  Bucket: "bucket name"
  Region: us-east-1
  KeyId: REDACTED
  KeySecret: REDACTED
  Endpoint: REDACTED
  ProxyDownload: False

image

Forceu commented 3 months ago

@MalteMagnussen The incorrect headers are now fixed in the latest dev version. I will have a look at your problem that the new file is not displayed in time later on.

MalteMagnussen commented 3 months ago

@Forceu - Thank you very much.

It is working now in our test and prod environments.