solracsf opened 4 years ago
Your issue is likely filelocking. Disable it in the nextcloud config, disable redis filelocking. Restart PHP-FPM and try reproducing this again.
In my case by disabling filelocking all of my issues related to deletion were resolved. I just let the S3 backend handle the filelocking now.
Filelocking is already disabled (see my config in the 1st post).
Filelocking is already disabled (see my config in the 1st post).
My bad, another foot in mouth moment. If you wait a while are they removed from the backend? Sometimes with S3 deletion is delayed on the backend.
Thanks, but I don't think so: if I upload a 200M file and delete it, I can see the change in real time in the S3 backend. Two hours have passed now and the files are still there (cron runs every 5 minutes).
Right but with S3 in particular if a file is removed but is locked on the S3 backend it can take a while for it to process the deletions. 2 hours is a fairly long time though.
If you have your S3 provider run the garbage collection process do the files stay or are they deleted?
With amazon they also include an option to retain locked objects for x days.
https://docs.aws.amazon.com/AmazonS3/latest/dev/object-lock-overview.html https://aws.amazon.com/blogs/storage/protecting-data-with-amazon-s3-object-lock/
Can you verify that's not the case and garbage collection doesn't resolve the issue?
I'm not using Amazon but Scaleway; they have Lifecycle Rules, but they are disabled by default. What do you mean by GC in S3?
I use radosgw-admin gc process. Each host has its own rules for garbage collection. Do they have an end-user option for garbage collection, or an API call to trigger it?
If not, you would need to contact them directly and ask how often it runs and whether they can run it now.
Edit: Scaleway runs it once a day on their cold S3 storage. I don't know about the other storage options; best to contact them about it.
I owe you an apology. I don't use nextcloud for images but for file storage. You are correct that files are not being deleted properly when it comes to image previews.
Same here with Wasabi as storage backend. I am having many problems with S3 currently. Maybe something more general is broken.
I can still confirm this with v19.0.5. My test instance is completely empty, no files at all, trashbin cleaned, but mc outputs this:

```
$ ./mc du minio/bucket
2.7GiB
```

and `./mc ls minio/bucket` lists hundreds of files from my different tests. Some of the files were created more than one month ago in the bucket. These are clearly not image previews, as I have big files in the bucket:
```
$ ./mc ls minio/bucket
...
[2020-11-24 20:16:54 CET] 115KiB urn:oid:50503
[2020-10-07 10:10:30 CEST] 1.1KiB urn:oid:15082
[2020-11-24 20:38:10 CET] 99KiB urn:oid:55762
[2020-10-07 10:09:26 CEST] 5.5MiB urn:oid:14773
[2020-11-24 20:38:09 CET] 192KiB urn:oid:55750
[2020-10-07 09:59:00 CEST] 26KiB urn:oid:11050
[2020-10-07 10:10:27 CEST] 110B urn:oid:15034
[2020-10-06 11:21:30 CEST] 360KiB urn:oid:883
[2020-11-24 20:21:33 CET] 271KiB urn:oid:54307
[2020-10-07 09:59:52 CEST] 25MiB urn:oid:11158
...
```
Summary: an empty instance, and a bucket with 2.87GB used and 3685 objects in it.
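For anyone wanting to check their own bucket, the mismatch above can be detected by comparing the bucket's `urn:oid:` keys against the `fileid` column of `oc_filecache`. Here is a minimal sketch in Python; the helper name and input shapes are illustrative and not part of Nextcloud:

```python
# Hypothetical helper: find bucket objects with no matching row in oc_filecache.
# `bucket_keys` would come from an object listing (e.g. `mc ls` output) and
# `db_fileids` from `SELECT fileid FROM oc_filecache`; both are assumptions.

def find_orphans(bucket_keys, db_fileids):
    """Return urn:oid keys whose numeric id is absent from oc_filecache."""
    known = set(db_fileids)
    orphans = []
    for key in bucket_keys:
        if key.startswith("urn:oid:"):
            oid = int(key[len("urn:oid:"):])
            if oid not in known:
                orphans.append(key)
    return orphans

# Example with made-up ids:
print(find_orphans(["urn:oid:883", "urn:oid:11050"], [883]))  # ['urn:oid:11050']
```

Anything this returns is an object the database no longer knows about, which is exactly the bloat described above.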
😮
Yes, I can confirm this issue with Nextcloud 20.0.1 also. My steps:
- upload a 500mb file.
- cancel the upload at 100mb.
Now my minio bucket contains 10 x 10mb chunked files which should've been deleted.
I've tested it with AWS S3, to eliminate any compatibility issues with S3 'compatible' providers. Object lock is disabled for this test case.
Problem remains.
- Can't delete folder (nothing in logs about this; it showed in the WebUI for 2 of them, but after refreshing the Files tab, it shows an empty root)
- `php occ trashbin:clean --all-users`
- Check the bucket stats of the now files-empty NC instance:

```
$ ./mc du --versions aws/bucket
2.9GiB
```
This is a serious problem for many reasons, especially GDPR, when users request their files to be deleted and they aren't, beyond the S3 billing for objects we aren't using anymore.
cc @nextcloud/server-triage can someone take a look at this? I believe this affects every ObjectStorage instance, but since files are named urn:oid:xxx, nobody really knows what files are in their buckets.
+1 for GDPR concerns.
It has been almost a year since this issue was brought up and S3 is heavily used in enterprise environments - any ideas when this will be prioritized? Unfortunately, we cannot rely on using S3 for storage if we cannot show that files are completely removed.
I have a suggestion on this issue.
I know it is hard to keep files (whether on a filesystem or object storage) and the database in sync, not to mention the caching involved. I believe it is impossible to keep the database correct after a hardware failure, such as a simple power failure. So I suggest there should be a way to check the current file list and file information against the database.
There is a command, `occ files:scan`, to sync files and the database for filesystem-based storage, but this is not applicable when using object storage as primary storage. I believe every server using object storage uses a standalone bucket (or directory), so it would be safe to clean up uncontrolled or unregistered files.
I would also appreciate it if developers took a look at the object-server direct download function; an issue was opened at #14675. This function could save the server non-essential bandwidth and load.
I use object storage (minio) as primary storage because files can be backed up easily. I don't need to shut down the Nextcloud server for a long time, and I can separate the database and file server easily. I believe this setup is widely used at enterprise level, and I hope this suggestion can help Nextcloud with deployment and migration.
I am having the same issue running Nextcloud 21.0.2 with Digital Ocean Spaces (S3) as primary storage. In my case it seems that the issue only occurs when server-side encryption is activated. Although, I haven't tested too much without encryption so I can't be too conclusive.
Also, I agree with @caretuse. It would be much appreciated if both of these features could be implemented in some future release.
Yes, I can confirm this issue with Nextcloud 20.0.1 also. My steps:
- upload a 500mb file.
- cancel the upload at 100mb.
Now my minio bucket contains 10 x 10mb chunked files which should've been deleted.
I have been running Nextcloud using S3 storage for over two years. I noticed my bucket was bloating early on. Digital Ocean shows my S3 was using 800GB even though my only user had 218 GB of files including versioning. I've been watching this issue for a long time now hoping for a solution, but finally got around to looking into it myself.
I compared the `s3cmd la` file list output to the `oc_filecache` database table. I expected to find extra trash files in the S3 bucket. I was perplexed to find that the file list matched perfectly. This was easy to check, as the database `fileid` is the `urn:oid:___` number. The file size from the data also matched. This led me to research more about S3 storage.
I finally found that the bloat was from old incomplete uploads. You can list these using the `s3cmd multipart s3://BUCKET/` command. S3 allows large uploads to be uploaded as smaller multipart files which are then concatenated when the upload is complete. This helps reduce data transfer in case of interrupted uploads, as it can resume near where it left off. It appears neither Nextcloud nor S3 storage is set to delete old incomplete multipart uploads by default. You can remove each file set individually using `s3cmd abortmp s3://BUCKET/FILENAME UPLOAD_ID`.
Nextcloud could remove old multipart data if it kept track of them. But S3 has the ability to do so on its own. Using s3cmd you upload an XML rule to the S3 bucket:

```
s3cmd setlifecycle lifecycle.xml s3://BUCKET/
```

Where lifecycle.xml is:
```
<LifecycleConfiguration>
  <Rule>
    <ID>Remove uncompleted uploads</ID>
    <Prefix/>
    <Status>Enabled</Status>
    <AbortIncompleteMultipartUpload>
      <DaysAfterInitiation>3</DaysAfterInitiation>
    </AbortIncompleteMultipartUpload>
  </Rule>
</LifecycleConfiguration>
```
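For those managing the bucket from Python instead of s3cmd, the same rule can be expressed as the payload for boto3's `put_bucket_lifecycle_configuration`. This is only a sketch: the endpoint and bucket name are placeholders, and provider support for lifecycle rules varies.

```python
# The lifecycle rule from the XML above, expressed in the payload shape that
# boto3's put_bucket_lifecycle_configuration expects.
lifecycle = {
    "Rules": [
        {
            "ID": "Remove uncompleted uploads",
            "Filter": {},  # empty filter: apply to every key in the bucket
            "Status": "Enabled",
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 3},
        }
    ]
}

# To apply it (untested sketch; endpoint/bucket are placeholders):
# import boto3
# s3 = boto3.client("s3", endpoint_url="https://s3.example.com")
# s3.put_bucket_lifecycle_configuration(Bucket="BUCKET",
#                                       LifecycleConfiguration=lifecycle)
```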
This rule will run once a day at midnight UTC according to what I found. After waiting a day my nearly 800 incomplete uploads spanning over two years were gone and my S3 storage now sits at 220GB as it should.
This doesn't appear to be the solution to all of the issues in this thread, but hopefully it helps some. In my case the files marked as trash or versioning in the database were being removed correctly according to the rules I have in Nextcloud's config file. I have transactional file locking disabled and encryption is not enabled.
I suppose this is still happening on NC21.0.4?
I use Nextcloud 22.0.1 and have the exact same problem with Scaleway S3. At this point I have about 35 GB used by my users, but storage is filled with 74 GB.
Edit: Manually running `./occ trashbin:clean --all-users` has fixed it for me, but I guess the problem will return in time.
Looks like the original issue is fixed then.
@acsfer can you still reproduce this on NC21.0.4 or NC22.1.1?
@szaimen can't help anymore here, we moved away from S3...
I can confirm the problem exists, even after upgrading to the latest (22.1.1) version.
`./occ trashbin:clean --all-users` didn't help.
In the interface I see 146.7 GB used, while minio shows 961 GB for this user and his bucket.
This issue has been automatically marked as stale because it has not had recent activity and seems to be missing some essential information. It will be closed if no further activity occurs. Thank you for your contributions.
I have a similar issue with my S3 bucket. But for me it's most likely caused by partially failed multipart uploads, where Nextcloud didn't clean up the chunks it had already pushed to S3 after the upload failed (other issue #29516).
Same issue here. Basically, I suggest everyone avoid using S3 as primary storage unless you want to throw money out the window.
I snooped around the Nextcloud database and it seems that the issue is that objects uploaded to S3 are not committed to the db until the transfer to S3 is completed. If a transfer is interrupted, then Nextcloud loses track of the object, since no record of ongoing transfers is kept.
A potential fix could be to log ongoing transfers in the database and occasionally do a clean-up if something goes wrong.
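A minimal sketch of that idea, using sqlite3 and an invented table name (not Nextcloud's actual schema): record the object id before streaming to S3, clear it on success, and treat leftover rows as garbage-collection candidates.

```python
import sqlite3

# Invented "pending uploads" ledger for illustration only.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE pending_uploads (oid INTEGER PRIMARY KEY, started_at TEXT)"
)

def begin_upload(oid):
    # Record the transfer before any bytes are sent to the object store.
    db.execute(
        "INSERT INTO pending_uploads (oid, started_at) "
        "VALUES (?, datetime('now'))",
        (oid,),
    )

def finish_upload(oid):
    # Only a completed transfer clears its ledger entry.
    db.execute("DELETE FROM pending_uploads WHERE oid = ?", (oid,))

def stale_uploads():
    # After a crash, rows still present here point at objects to clean up.
    return [row[0] for row in db.execute("SELECT oid FROM pending_uploads")]

begin_upload(42)   # transfer starts
begin_upload(43)
finish_upload(42)  # transfer 42 completed; 43 was interrupted
print(stale_uploads())  # [43]
```

A periodic job could then delete the objects listed by `stale_uploads()` from the bucket and clear their rows.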
Until this is fixed Nextcloud will continue to bloat the bucket, so I've hacked together a python script that cleans up the S3 storage. It doesn't solve any of the open issues using S3 as primary storage - it simply cleans up orphaned objects in the bucket, thereby bringing down the amount of storage used by Nextcloud.
DISCLAIMER: I'm a stranger on the internet providing a script, that requires access to your personal (and probably sensitive data) -> Do not trust strangers on the internet. Please review the code before running it. I'm not responsible if this script destroys your data, corrupts your db, makes your house catch fire, or curse you to step on Lego bricks every time you have bare feet.
Since the issue seems to be caused by the db not being updated until a transfer to S3 is complete, the script might delete objects that have successfully been transferred to S3 but have not yet been recorded in the database, if it's run while a sync is in progress. Therefore you should not run this while a sync is in progress. I repeat: do not run this while a sync is in progress!
I've run/tested this against my own setup (Minio + Postgres) and haven't encountered any issues so far. If you use any other combination of S3 compatible storage and database, you'll need to modify the code to your needs:
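One way to reduce the risk described in the disclaimer is to only delete orphaned objects older than a grace period, so objects from an in-progress sync are left alone. A sketch of that check, with invented names and input shapes:

```python
from datetime import datetime, timedelta, timezone

def safe_to_delete(objects, known_ids, grace=timedelta(days=2)):
    """Keep only orphans older than the grace period.

    `objects` is a list of (oid, last_modified) tuples, as an S3 listing
    would provide; `known_ids` are the fileids present in oc_filecache.
    Names and shapes here are illustrative, not Nextcloud's API.
    """
    cutoff = datetime.now(timezone.utc) - grace
    return [oid for oid, mtime in objects
            if oid not in known_ids and mtime < cutoff]

old = datetime.now(timezone.utc) - timedelta(days=10)
new = datetime.now(timezone.utc)
# oid 1 is an old orphan (delete), 2 is too recent, 3 is still in the db:
print(safe_to_delete([(1, old), (2, new), (3, old)], {3}))  # [1]
```

The grace period is a trade-off: longer is safer for slow syncs, shorter reclaims storage sooner.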
Why reinvent the wheel? Look back at my post on July 4 and S3 lifecycle rules. Since then I have had zero issues with S3 storage bloating from NC 20 through 23. (https://github.com/nextcloud/server/issues/20333#issuecomment-873608624)
@Scandiravian made a good script to solve the database-inconsistency issue, although I believe this should be implemented in `occ trashbin:cleanup --all-users`, just like NeoTheThird mentioned in #29841.
@jeffglancy and otherguy also made good scripts to solve another issue, which is cleaning up pending multipart uploads in S3. But, lazy as I am, I would choose `rclone cleanup s3:bucket` from the rclone documentation, a rather simple and mistake-proof solution.
Hi, please update to 24.0.8 or better 25.0.2 and report back if it fixes the issue. Thank you!
I tested some scenarios with `occ files:scan --all`, up until deleting manually from Nextcloud. I can't confirm the "files not shown in Nextcloud" scenario; it would require manipulating things at the database level, which goes beyond my interest.
Does anyone have an environment to test?
Hi @szaimen ,
we are currently running Nextcloud 25 and still experience this problem. On one instance, our S3 bucket shows 209GB of data, while summing the different users' quotas in NC itself comes to about 55GB.
```
select sum(size/1024/1024/1024) as size_GB, count(*) as anzahl from oc_filecache where mimetype != 2;
```
shows around 203GB of data which is tracked by NC. The trashbin is (almost) empty (~2GB).
Occurs on different Instances, which were built with a custom Docker Image.
@Corinari
I created a script (S3->local) once upon a time when I had trouble with S3, partially because I found a bug and feared it was S3-related (but it wasn't; fixed that one: https://github.com/nextcloud/server/issues/34422 ;) ). Later on (partially with the help of that migration script), I dared to migrate back to S3. "Reversing" that script was quite a challenge, but I got it working. In creating that script I built in various sanity checks, and I now run my "local->S3" script every now and then to clean up my S3. Barring a little hiccup every now and then, the script rarely needs to clean stuff up.
A few weeks ago I decided to publish it on Github, take a look at: https://github.com/mrAceT/nextcloud-S3-local-S3-migration
PS: I have various users on my Nextcloud, totaling some 100+ GB of data.
I wrote a Python script to delete orphaned S3 objects (among other work-arounds for NC lack of proper S3 support): https://github.com/aurelienpierre/clean-nextloud-s3
Is there already a real solution from nextcloud? Facing the same problem currently.
@aurelienpierre your script might help, but it's not ready for other S3 vendors like OVH Cloud: https://github.com/aurelienpierre/clean-nextloud-s3/issues/2
Also, scanning 300k objects takes a lot of time, and downloading them costs €€ ;)
I can still reproduce this in Nextcloud 29.
Pre upload:

```
$ du -sm ./minio/data/
6897	./minio/data/
```

Upload a ~5GB file, abort at ~50%, and find:

```
$ du -sm ./minio/data/
9541	./minio/data/
```
@tsohst sorry, I don't know. I stopped using nextcloud because of this error years ago.
Disclaimer: Work in progress.
Based on my review of this thread, it doesn't appear everyone has the same underlying cause (though the symptoms are somewhat similar). Since this issue has a fairly broad title, it's likely there is also some overlap with other open issues (I'll try to review these as time permits and sort some of them out).
Here are the apparent underlying causes I've been able to identify from this issue:
Locking (and legal holds) got mentioned, but hasn't seemed to be a factor with anyone here.
I'll also toss a couple of others into the list:
- groupfolders or encryption (?)

Some of these (but not all) could be addressed through some documentation tweaks.
Keep in mind this is a work-in-progress analysis. Here are notes on a couple of the biggies above.
Different providers and object store platforms have different defaults. For example, Backblaze has versioning on by default. AWS has it off by default, but when it is turned on, versions of individual objects are apparently hidden by default in some places in their web UI, so it can be easy to miss that versioning has been enabled through org policy.
Solution: Either turn off versioning or add lifecycle management rules on your S3 platform. Also, the `files_versions_s3` app may be of interest: https://github.com/nextcloud/files_versions_s3/
Maybe we can do better here, but it's going to take some work to figure that out. On the other hand, lifecycle rules can be made to handle this situation well (and cleanly) from the looks of it.
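If versioning is the cause, a lifecycle rule can also expire old object versions automatically. Below is such a rule in the payload shape boto3's `put_bucket_lifecycle_configuration` expects; the retention period and rule ID are placeholders, and provider support for this rule type varies.

```python
# A lifecycle rule that expires noncurrent object versions after 30 days,
# addressing the "versioning" cause above. Values are placeholders.
versions_rule = {
    "Rules": [
        {
            "ID": "Expire noncurrent versions",
            "Filter": {},  # apply to the whole bucket
            "Status": "Enabled",
            "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
        }
    ]
}

# Applied the same way as any lifecycle configuration (untested sketch):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="BUCKET", LifecycleConfiguration=versions_rule)
```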
Steps to reproduce
Expected behaviour
Trashbin should be emptied correctly
Actual behaviour
After some time, an error appears ("Error while empty trash"). Reloading the page shows no more files in either Files or the Trashbin.
But files are still on the object storage; here are OBJECTS and SIZE:
Before these test operations (upload, delete...)
The following commands were executed (after):
One user has reported that the interface shows he is "using" 1.9 GB of storage, but he has NO FILES or FOLDERS at all, either in FILES or TRASHBIN, on a production instance.
Server configuration
Operating system: Ubuntu 18.04
Web server: Nginx 17
Database: MariaDB 10.4
PHP version: 7.3
Nextcloud version: (see Nextcloud admin page) 18.0.3
Updated from an older Nextcloud/ownCloud or fresh install: Fresh install
Where did you install Nextcloud from: Official sources
Signing status:
```
No errors have been found.
```
List of activated apps:
App list
```
Enabled:
 - accessibility: 1.4.0
 - admin_audit: 1.8.0
 - announcementcenter: 3.7.0
 - apporder: 0.9.0
 - cloud_federation_api: 1.1.0
 - dav: 1.14.0
 - external: 3.5.0
 - federatedfilesharing: 1.8.0
 - files: 1.13.1
 - files_accesscontrol: 1.8.1
 - files_automatedtagging: 1.8.2
 - files_pdfviewer: 1.7.0
 - files_rightclick: 0.15.2
 - files_sharing: 1.10.1
 - files_trashbin: 1.8.0
 - files_versions: 1.11.0
 - files_videoplayer: 1.7.0
 - groupfolders: 6.0.3
 - impersonate: 1.5.0
 - logreader: 2.3.0
 - lookup_server_connector: 1.6.0
 - notifications: 2.6.0
 - oauth2: 1.6.0
 - password_policy: 1.8.0
 - privacy: 1.2.0
 - provisioning_api: 1.8.0
 - settings: 1.0.0
 - sharebymail: 1.8.0
 - theming: 1.9.0
 - theming_customcss: 1.5.0
 - twofactor_backupcodes: 1.7.0
 - viewer: 1.2.0
 - workflow_script: 1.3.1
 - workflowengine: 2.0.0
```
Nextcloud configuration:
Config report
``` { "system": { "objectstore": { "class": "\\OC\\Files\\ObjectStore\\S3", "arguments": { "bucket": "testing.example.com", "autocreate": true, "key": "***REMOVED SENSITIVE VALUE***", "secret": "***REMOVED SENSITIVE VALUE***", "hostname": "10.1.0.2", "port": 8080, "use_ssl": false, "region": "fr-par", "use_path_style": true } }, "log_type": "file", "logfile": "\/var\/log\/nextcloud\/testing.example.com-nextcloud.log", "passwordsalt": "***REMOVED SENSITIVE VALUE***", "secret": "***REMOVED SENSITIVE VALUE***", "trusted_domains": [ "testing.example.com" ], "datadirectory": "***REMOVED SENSITIVE VALUE***", "dbtype": "mysql", "version": "18.0.3.0", "overwrite.cli.url": "https:\/\/testing.example.com", "dbname": "***REMOVED SENSITIVE VALUE***", "dbhost": "***REMOVED SENSITIVE VALUE***", "dbport": "3306", "dbtableprefix": "oc_", "mysql.utf8mb4": true, "dbuser": "***REMOVED SENSITIVE VALUE***", "dbpassword": "***REMOVED SENSITIVE VALUE***", "dbdriveroptions": { "1009": "\/etc\/ssl\/mysql\/ca-cert.pem", "1008": "\/etc\/ssl\/mysql\/client-cert.pem", "1007": "\/etc\/ssl\/mysql\/client-key.pem", "1014": false }, "installed": true, "skeletondirectory": "", "default_language": "fr", "default_locale": "fr_FR", "activity_expire_days": 30, "auth.bruteforce.protection.enabled": false, "blacklisted_files": [ ".htaccess", "Thumbs.db", "thumbs.db" ], "htaccess.RewriteBase": "\/", "integrity.check.disabled": false, "knowledgebaseenabled": false, "logtimezone": "Europe\/Paris", "maintenance": false, "memcache.local": "\\OC\\Memcache\\APCu", "memcache.distributed": "\\OC\\Memcache\\Redis", "updatechecker": false, "appstoreenabled": false, "upgrade.disable-web": true, "filelocking.enabled": false, "overwriteprotocol": "https", "preview_max_scale_factor": 1, "redis": { "host": "***REMOVED SENSITIVE VALUE***", "port": 6379, "timeout": 2.5, "dbindex": 2, "password": "***REMOVED SENSITIVE VALUE***" }, "quota_include_external_storage": false, "theme": "", "trashbin_retention_obligation": "auto, 
7", "updater.release.channel": "stable", "mail_smtpmode": "smtp", "mail_smtpsecure": "tls", "mail_sendmailmode": "smtp", "mail_from_address": "***REMOVED SENSITIVE VALUE***", "mail_domain": "***REMOVED SENSITIVE VALUE***", "mail_smtpauth": 1, "mail_smtphost": "***REMOVED SENSITIVE VALUE***", "mail_smtpport": "587", "mail_smtpname": "***REMOVED SENSITIVE VALUE***", "mail_smtppassword": "***REMOVED SENSITIVE VALUE***", "instanceid": "***REMOVED SENSITIVE VALUE***", "overwritehost": "testing.example.com", "preview_max_x": "1280", "preview_max_y": "800", "jpeg_quality": "70", "loglevel": 2, "enabledPreviewProviders": [ "OC\\Preview\\PNG", "OC\\Preview\\JPEG", "OC\\Preview\\GIF", "OC\\Preview\\BMP", "OC\\Preview\\XBitmap" ], "apps_paths": [ { "path": "\/var\/www\/apps", "url": "\/apps", "writable": false }, { "path": "\/var\/www\/custom", "url": "\/custom_apps", "writable": true } ] } } ```Logs are completely empty (we have just fired up a test instance, and test this use case).
Similar to https://github.com/nextcloud/server/issues/17744