veliovgroup / Meteor-Files

🚀 Upload files via DDP or HTTP to ☄️ Meteor server FS, AWS, GridFS, DropBox or Google Drive. Fast, secure and robust.
https://packosphere.com/ostrio/files
BSD 3-Clause "New" or "Revised" License

HTTP uploads not working, multi-instance/cluster environment NGINX balanced #877

Closed: sylido closed this issue 11 months ago

sylido commented 12 months ago

Issue:

Other Info:

Looking at the source code, I see that the requests don't rely only on the getUser function: the onBeforeUpload hook on the server side also gets triggered, and the user check that happens there returns null, so the upload cannot proceed. We don't have access to the original request that contains the cookie, so no custom validation can happen there either.
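For illustration, this is roughly the kind of server-side check I mean (collection name and size limit are made up): over DDP this.userId comes from the live connection, but over HTTP it has to be resolved from the incoming request, and that is the part that comes back null for us.

```js
import { FilesCollection } from 'meteor/ostrio:files';

const uploads = new FilesCollection({
  collectionName: 'uploads',
  onBeforeUpload(file) {
    // Over DDP this.userId comes from the authenticated connection;
    // over HTTP it has to be resolved from the request, which fails
    // when the request lands on an instance that doesn't know the session.
    if (!this.userId) {
      return 'Must be logged in to upload';
    }
    return file.size <= 100 * 1024 * 1024; // arbitrary 100 MB cap for the sketch
  }
});
```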

I think what's happening is that the WS that Meteor uses gets routed to one instance, but when nginx gets the HTTP request for the upload it doesn't get routed to that same instance. Is there a way to prevent that?
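For example, would pinning clients to an instance at the nginx level be the right approach? Something along these lines (upstream name and ports are made up):

```nginx
upstream meteor_app {
  ip_hash;                  # same client IP -> same instance, for both WS and HTTP
  server 127.0.0.1:3001;
  server 127.0.0.1:3002;
}

server {
  listen 443 ssl;

  location / {
    proxy_pass http://meteor_app;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;   # keep WebSocket upgrades working
    proxy_set_header Connection "upgrade";
    proxy_set_header Host $host;
  }
}
```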

Another possible solution I see is to fully move the auth code to use the config.getUser logic before running the onBeforeUpload hook; that way we wouldn't need to check for authorization there.

A third solution would be the ability to override the ROOT_URL environment variable with something custom, i.e. the correct load-balanced instance with the proper port.

With all that said, simply switching to DDP works with multiple instances, as it uses the WS to make the request and that gets routed to the same instance, but it is sometimes slower than HTTP.

On the other hand, HTTP works just fine on single instances. Doing this locally in dev mode also seems to work, albeit the setup is a bit different since there is no HTTP-to-HTTPS rerouting.

Thanks for any thoughts/feedback

dr-dimitru commented 12 months ago

Hello @sylido ,

Thank you for the detailed report. It may take me some time to run tests simulating your environment.

On the other hand, the quick solution might be switching to the ddp transport. Have you looked into this option? It shouldn't take much of your development time, and I see no reason not to use ddp, especially if it fits your case.
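Switching is a per-upload option on the client, something along these lines (the collection variable and file input are just for illustration):

```js
// client-side: upload over the existing DDP/WebSocket connection
// instead of a separate HTTP request
filesCollection.insert({
  file: fileInput.files[0],
  transport: 'ddp',      // default is 'http'
  chunkSize: 'dynamic'
});
```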

sylido commented 11 months ago

Hi @dr-dimitru,

Yeah, as I mention towards the end of my long description, DDP does seem to work since it keeps the same WS open. I remember DDP being slower compared to HTTP, but we haven't noticed that now that we've switched to DDP. Was there an upgrade/change I missed that implemented some logic to fix that? If that's the case, I think DDP would work.

Recently we've started experiencing similar failures when downloading files as well: it seems like the request randomly goes to the wrong instance, and in this case we cannot switch to DDP afaik. What would be the best course of action in your mind?

dr-dimitru commented 11 months ago

@sylido

> Yeah, as I mention towards the end of my long description, DDP does seem to work since it keeps the same WS open. I remember DDP being slower compared to HTTP, but we haven't noticed that now that we've switched to DDP. Was there an upgrade/change I missed that implemented some logic to fix that? If that's the case, I think DDP would work.

It's just slightly slower, and it depends from case to case. If you need to upload files under 100 MB and stability/reliability is your priority over speed, use ddp. As a user you probably won't notice the change.

> Recently we've started experiencing similar failures when downloading files as well: it seems like the request randomly goes to the wrong instance, and in this case we cannot switch to DDP afaik. What would be the best course of action in your mind?

  1. Assign a unique "service" domain name to each instance and store its details in the file's meta. Use it in hooks to adjust how links are generated (rough sketch below); OR
  2. The "best practice" would be to store files in centralized or synchronized storage, so that no matter which instance a file was uploaded to, all instances have access to all files. Good examples are AWS S3 and GridFS.
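A rough sketch of the first option (the INSTANCE_URL env var and field names are placeholders, adjust to your setup):

```js
import { FilesCollection } from 'meteor/ostrio:files';

const files = new FilesCollection({
  collectionName: 'files',
  onAfterUpload(fileRef) {
    // remember which instance physically stores this file
    this.collection.update(fileRef._id, {
      $set: { 'meta.instanceUrl': process.env.INSTANCE_URL }
    });
  }
});

// when generating a download link, point at the instance that owns the file
const downloadUrl = (fileRef) => {
  // files.link() builds an absolute URL from ROOT_URL;
  // swap its origin for the instance that stores the file
  const link = new URL(files.link(fileRef));
  return `${fileRef.meta.instanceUrl}${link.pathname}${link.search}`;
};
```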

dr-dimitru commented 11 months ago

@sylido lmk if this was solved, and in which way?

sylido commented 11 months ago

Hi @dr-dimitru,

We had to resort to one of the hackier solutions for now, from https://github.com/veliovgroup/Meteor-Files/issues/737#issuecomment-1401458220

Not sure why, but even on a single instance that is not load balanced by nginx, on a bundled production server we have problems where some users randomly stop being able to download files. Opening multiple tabs and closing them in different orders didn't seem to reproduce it, but waiting some time or trying a different browser would somehow break downloads and the user would be stuck on "Access Denied !", which comes from the protected function - the user would be empty.
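For context, our check in the protected hook is roughly this (simplified, the real one does more than look at the user id):

```js
const files = new FilesCollection({
  collectionName: 'files',
  protected(fileObj) {
    // this.userId is resolved from the request cookies on HTTP downloads;
    // when it comes back empty we deny the request, which is where the
    // "Access Denied !" the users end up seeing comes from
    return !!this.userId;
  }
});
```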

I'm not sure what the long-term solution for this and for uploads through HTTP might be. I might make a PR if I get to the bottom of x_mtok and how it's being used; I was assuming we could use a separate cookie that would persist between tabs/instances and identify the user that way.

Another thing that was suggested was to skip the interceptDownload logic entirely: build the file contents on the server, send them back to the client, generate an anchor tag dynamically and click it for the user. This might supersede your suggestion 1. from above, as the problem is the missing user document rather than the file document, which already has some meta information.
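Roughly what was suggested, as a sketch (the method name is made up, and it only makes sense for small files since the whole file goes over DDP and is held in memory):

```js
// server (sketch): a method that returns the raw file contents
import fs from 'fs';
import { Meteor } from 'meteor/meteor';
import { check } from 'meteor/check';

Meteor.methods({
  'files.contents'(fileId) {
    check(fileId, String);
    const fileRef = files.collection.findOne(fileId); // `files` is the FilesCollection
    if (!fileRef) throw new Meteor.Error(404, 'File not found');
    const buffer = fs.readFileSync(fileRef.path);     // or fetch from GridFS/S3
    return { base64: buffer.toString('base64'), name: fileRef.name, type: fileRef.type };
  }
});

// client (sketch): rebuild the file and click a generated anchor
Meteor.call('files.contents', fileId, (err, res) => {
  if (err) return console.error(err);
  const bytes = Uint8Array.from(atob(res.base64), (c) => c.charCodeAt(0));
  const url = URL.createObjectURL(new Blob([bytes], { type: res.type }));
  const a = Object.assign(document.createElement('a'), { href: url, download: res.name });
  document.body.appendChild(a);
  a.click();
  a.remove();
  URL.revokeObjectURL(url);
});
```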

I think 2. doesn't work for all our use cases at this point, but we are looking into it.

Thanks for the help - closing this one for now!