Closed: yuri-becker closed this issue 2 months ago.
@yuri-becker I guess we need more information.

> I assume this is some kind of permission issue. Is OCIS having issues with changing the user?

If that user has write permissions on the volumes/directories, it shouldn't.
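Something along these lines could be used to check that quickly; "ocis" is just a placeholder for the actual container name, and the paths come from the volume mounts in the setup shown further down in the issue description:

```sh
# Sketch: check whether the user the container runs as can write to both volumes.
# "ocis" is a placeholder for the real container name.
docker exec ocis sh -c 'touch /etc/ocis/.writetest /var/lib/ocis/.writetest \
  && echo writable \
  && rm /etc/ocis/.writetest /var/lib/ocis/.writetest'
```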
@rhafer Thanks for your reply.
Hm, there are tons of `nats`-related errors in the logs (timeouts, warnings about dropped messages). Something seems to be really weird there. Is that server under constant high CPU load, or just when you're trying to upload?
Also, can you check what `ocis storage-users uploads sessions` says? Are there any sessions stuck? Maybe it helps to clean them up. (See https://doc.owncloud.com/ocis/5.0/deployment/services/s-list/storage-users.html#sessions-command for details.)
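For reference, roughly how listing and cleaning the sessions could look when ocis runs in Docker (the `--clean` flag is the one from the linked docs; "ocis" is a placeholder for the actual container name):

```sh
# Sketch: inspect and clean stuck upload sessions inside the running container.
docker exec ocis ocis storage-users uploads sessions          # list sessions
docker exec ocis ocis storage-users uploads sessions --clean  # remove stale ones
```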
Could you by chance try if a clean setup of ocis 6.4.0 (using separate volumes) shows the same behavior on your system?
BTW, was the behavior broken from the beginning? Or could you successfully upload files at some point before?
Yea, OCIS is producing a high CPU load constantly without user interaction, possibly because of retrying to delete uploads over and over. `ocis storage-users uploads sessions` outputs a ton of sessions. Clearing them didn't help.
On a fresh install, the upload worked after a few tries.

> Yea, OCIS is producing a high CPU load constantly without user interaction, possibly because of retrying to delete uploads over and over. `ocis storage-users uploads sessions` outputs a ton of sessions. Clearing them didn't help.

Does that mean the sessions are still visible after `ocis storage-users uploads sessions --clean`?
To get the `nats` service back into a sane state, you could try to clear the `nats` folder in your data volume while ocis is stopped, and then restart ocis; that will re-initialize the nats directory.
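With the Docker setup from the issue description (data volume mounted at /var/lib/ocis), that could look roughly like this; the container and volume names are placeholders, and the exact location of the nats folder inside the data volume is an assumption, so verify the path before deleting anything:

```sh
# Sketch: stop ocis, remove the nats state from the data volume, start ocis again.
# "ocis" and "owncloud_data" stand in for the real container / volume names.
docker stop ocis
docker run --rm -v owncloud_data:/var/lib/ocis alpine rm -rf /var/lib/ocis/nats
docker start ocis   # ocis re-initializes the nats directory on startup
```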

> On a fresh install, the upload worked after a few tries.

What does "after a few tries" mean here? Which release is that fresh install using?
Also, what kind of machine is this running on? How much RAM, what kind of CPU, storage, etc.?

> Does that mean the sessions are still visible after `ocis storage-users uploads sessions --clean`?

The table was empty afterwards.

> To get the `nats` service back into a sane state, you could try to clear the `nats` folder in your data volume while ocis is stopped, and then restart ocis; that will re-initialize the nats directory.

That seemed to have fixed it temporarily. Once I try to upload a "larger" file (19 MB), it goes back to being broken again.
What does "after a few tries" mean here? Which release is that fresh install using?
Fails twice, then upload succeeds on the third try. Then fails twice again and succeeds on the third try once more. That was only a <1MB file tho.

> Also, what kind of machine is this running on? How much RAM, what kind of CPU, storage, etc.?

I have 43 TB of free storage, 126 GB of RAM (of which 14 GB is used) and a Xeon E5-1650 v3 (6 cores at 3.8 GHz).
EDIT: The fresh install was using the same release, I'm running OCIS in Docker and it should pull the latest version daily.

> EDIT: The fresh install was using the same release, I'm running OCIS in Docker and it should pull the latest version daily.

We've recently introduced so-called rolling releases, which we publish roughly every 3 weeks (see https://owncloud.dev/ocis/release_roadmap/#release-types). They go to a different Docker repo: https://hub.docker.com/r/owncloud/ocis-rolling. It might be helpful to check whether the issue still appears with the latest release from there (which is 6.4.0).
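Roughly, that would just mean pulling from the other repository; the exact tag is an assumption, check the Docker Hub page for what's currently published:

```sh
# Sketch: fetch the latest rolling release instead of the production image;
# see https://hub.docker.com/r/owncloud/ocis-rolling for the available tags.
docker pull owncloud/ocis-rolling:6.4.0
```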
The issue still persists on 6.4.0.
Ok, thanks for trying.
Could you please set `OCIS_LOG_LEVEL=debug`, start a clean ocis (ideally 6.4.0) instance, reproduce the upload issue and then attach the full log of that?
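For the Ansible task shown further down in the issue description, a throwaway debug run could be sketched as just these overrides; everything not shown stays as in the original task, the volume names are placeholders for fresh, empty volumes, and the rolling tag is an assumption:

```yaml
# Sketch: overrides for a clean debug instance; all other keys and env entries
# stay as in the original docker_container task. Volume names are placeholders.
image: owncloud/ocis-rolling:6.4.0
env:
  OCIS_LOG_LEVEL: "debug"                 # was "warn"
volumes:
  - "owncloud_debug_config:/etc/ocis"     # fresh volumes so the instance starts clean
  - "owncloud_debug_data:/var/lib/ocis"
```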
Done. Funnily enough, on this install I wasn't able to upload even small files. owncloud_logs_fresh_install.txt
Ok, I think that gets us a bit further. I can see a couple of successful uploads, but there is also at least one upload failing:
2024-09-25T15:16:38Z INF access-log | service=proxy proto=HTTP/1.1 request-id=a1c6abd7-29be-454b-9a52-3d26fac34430 traceid=b48f7713426e8f72d7e28fb5f67ef253 remote-addr=93.208.197.202 method=POST status=502 path=/remote.php/dav/spaces/c6d8a02e-77f9-409c-93e2-12b72659fa41$b69dbc42-e8bd-4e24-a626-4d6aae354d64 duration=56.566584 bytes=0 line=github.com/owncloud/ocis/v2/services/proxy/pkg/middleware/accesslog.go:34
2024-09-25T15:16:38Z DBG Decomposedfs: built session info | service=storage-users pkg=rgrpc traceid=b48f7713426e8f72d7e28fb5f67ef253 session={} line=github.com/cs3org/reva/v2@v2.24.1/pkg/storage/utils/decomposedfs/upload.go:305
2024-09-25T15:16:38Z INF file upload | service=storage-users pkg=rgrpc traceid=b48f7713426e8f72d7e28fb5f67ef253 data-server=http://localhost:9158/data/simple/26ae9dd0-ea47-4ef3-bbeb-0538c052aa56 fn=./small_file.stl xs=map[md5:100 unset:1000] line=github.com/cs3org/reva/v2@v2.24.1/internal/grpc/services/storageprovider/storageprovider.go:470
2024-09-25T15:16:38Z INF file upload | service=storage-users pkg=rgrpc traceid=b48f7713426e8f72d7e28fb5f67ef253 data-server=http://localhost:9158/data/tus/26ae9dd0-ea47-4ef3-bbeb-0538c052aa56 fn=./small_file.stl xs=map[md5:100 unset:1000] line=github.com/cs3org/reva/v2@v2.24.1/internal/grpc/services/storageprovider/storageprovider.go:470
2024-09-25T15:16:38Z DBG unary | service=storage-users pkg=rgrpc traceid=b48f7713426e8f72d7e28fb5f67ef253 user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36 from=tcp://127.0.0.1:49880 uri=/cs3.storage.provider.v1beta1.ProviderAPI/InitiateFileUpload start=25/Sep/2024:15:16:38 +0000 end=25/Sep/2024:15:16:38 +0000 time_ns=61016562 code=OK line=github.com/cs3org/reva/v2@v2.24.1/internal/grpc/interceptors/log/log.go:69
2024-09-25T15:16:38Z DBG unary | service=gateway pkg=rgrpc traceid=b48f7713426e8f72d7e28fb5f67ef253 user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36 from=tcp://127.0.0.1:51788 uri=/cs3.gateway.v1beta1.GatewayAPI/InitiateFileUpload start=25/Sep/2024:15:16:38 +0000 end=25/Sep/2024:15:16:38 +0000 time_ns=62587837 code=OK line=github.com/cs3org/reva/v2@v2.24.1/internal/grpc/interceptors/log/log.go:69
2024-09-25T15:16:38Z ERR error doing PATCH request to data gateway | service=ocdav name=com.owncloud.web.ocdav traceid=b48f7713426e8f72d7e28fb5f67ef253 request-id=a1c6abd7-29be-454b-9a52-3d26fac34430 spaceid=c6d8a02e-77f9-409c-93e2-12b72659fa41$b69dbc42-e8bd-4e24-a626-4d6aae354d64 path=/ filename=small_file.stl error=Patch "https://cloud.catboy.house/data/eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJhdWQiOiJyZXZhIiwiZXhwIjoxNzI3MzYzNzk4LCJpYXQiOjE3MjcyNzczOTgsInRhcmdldCI6Imh0dHA6Ly9sb2NhbGhvc3Q6OTE1OC9kYXRhL3R1cy8yNmFlOWRkMC1lYTQ3LTRlZjMtYmJlYi0wNTM4YzA1MmFhNTYifQ.25taQcy58gE_SF8KzQzInVeAeaptFimE2u_gWLHoJy8": context canceled line=github.com/cs3org/reva/v2@v2.24.1/internal/http/services/owncloud/ocdav/tus.go:298
As part of the TUS upload processing, ocis needs to send a PATCH request to the so-called data gateway. It's currently sending that to the public ocis URL, so that request will go through your reverse proxy (caddy, I believe). For some (yet unknown) reason, the request is never received by ocis for this specific upload; at least there is no correlating access-log message for the PATCH request with a matching `request-id`. So it somehow got lost between ocis and the reverse proxy.
Do you have any logs from your reverse proxy from around that time? If not, you might wanna increase the log level of the reverse proxy and reproduce the issue once more.
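For the Caddy site block shown further down in the issue description, enabling per-site access logging could look roughly like this (a sketch; the log file path is a placeholder):

```caddy
cloud.catboy.house {
	# Placeholder path; write the access log wherever suits your setup.
	log {
		output file /var/log/caddy/cloud.catboy.house-access.log
	}
	reverse_proxy 172.17.0.1:28946 {
		request_buffers 256k
		response_buffers 256k
	}
}
```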
I haven't been logging, but I enabled logging and re-tried the uploads. Here you go. Thanks again for all your help. cloud.catboy.house-access.log
@yuri-becker Which caddy release are you using? And why are you setting:
request_buffers 256k
response_buffers 256k
Does it work without those, by chance? (The docs say those should only be used if really really needed)

> @yuri-becker Which caddy release are you using? And why are you setting `request_buffers 256k` / `response_buffers 256k`? Does it work without those, by chance? (The docs say those should only be used if really really needed)

@rhafer Without these, everything seems to work as it should. Thank you for your help. I would be curious whether the same issue exists on nginx, but it might also be that I didn't set something in Caddy that I should have.
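For anyone hitting the same problem: the working configuration is simply the site block from the setup below with the two buffer directives dropped:

```caddy
cloud.catboy.house {
	reverse_proxy 172.17.0.1:28946
}
```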
Describe the bug
File uploads via the web UI fail, and the server has high CPU usage without any user activity.
Steps to reproduce
Try to upload a file via the web UI.
Expected behavior
The upload succeeds and the file is available.
Actual behavior
Upload fails with "Unknown Error". For other users, the upload seems to have succeeded, but the file is grayed out and not available. The log then contains corresponding error entries, and one message keeps popping up repeatedly.
Setup
The server is managed via Ansible. The container setup is this:
```yaml
- name: 🚀 Starts container
  community.docker.docker_container:
    name: "{{ owncloud_container_name }}"
    image: owncloud/ocis
    state: started
    restart_policy: unless-stopped
    ports:
      - "{{ docker_gateway }}:{{ owncloud_service_port }}:9200"
    entrypoint:
      - /bin/sh
    command: ["-c", "ocis init || true; ocis server"]
    env:
      OCIS_INSECURE: "false"
      OCIS_URL: "https://{{ owncloud_domain }}"
      PROXY_HTTP_ADDR: "0.0.0.0:9200"
      PROXY_TLS: "false"
      OCIS_LOG_LEVEL: "warn"
      OCIS_ADD_RUN_SERVICES: "notifications"
      NOTIFICATIONS_SMTP_HOST: "{{ smtp_server }}"
      NOTIFICATIONS_SMTP_PORT: "{{ smtp_port }}"
      NOTIFICATIONS_SMTP_SENDER: "{{ owncloud_smtp_sender }}"
      NOTIFICATIONS_SMTP_USERNAME: "{{ owncloud_smtp_user }}"
      NOTIFICATIONS_SMTP_PASSWORD: "{{ owncloud_smtp_password }}"
      NOTIFICATIONS_SMTP_INSECURE: "false"
    user: "{{ owncloud_uid }}:{{ app_gid }}"
    volumes:
      - "{{ owncloud_volume_config }}:/etc/ocis"
      - "{{ owncloud_volume_data }}:/var/lib/ocis"
```
The reverse proxy (caddy) has this setup for owncloud:
```caddy
cloud.catboy.house {
	reverse_proxy 172.17.0.1:28946 {
		request_buffers 256k
		response_buffers 256k
	}
}
```
The complete setup is here: https://github.com/yuri-becker/catboy-house/tree/main/roles/owncloud
Additional context
I assume this is some kind of permission issue. Is OCIS having issues with changing the user? Uploading via the client seems to work, or seemed to have worked.