Open AshRaghav opened 8 years ago
@ebrehault @jean - any ideas on what could be the problem here?
Is there a way to find out what the largest blobstorage folders are pointing to in the site? The massive size is affecting backup schedules and causing some major concerns. Any direction would be appreciated.
I forgot to mention that we are also using replication between two databases, so I am not sure whether that is causing the issue.
Thanks
Do you have File Attachment fields in your Plomino db? If you don't, your problem is not related to Plomino. If you do, try to export your db docs as XML on the server, and then see how big the export is.
Note: more generally, Plone might have a big blobstorage if:
Thanks @ebrehault ! That was a brilliant insight.
I managed to export the XML and found that a 5MB PDF file (downloaded to my desktop) that was uploaded to our website has bloated into a 950MB XML file on the server. A closer look reveals that the PDF had 6 pages of screenshots of images uploaded to us as evidence. The same goes for a 12MB Word document (downloaded to my desktop) containing screenshots/photographs of evidence, which shows up as a 1.2GB XML file.
I understand that it is binary storage, but is this the way blobs are always stored, or are we doing something horribly wrong? What should have been a storage size of around 5-7GB now looks to be 45GB on the server.
Regarding your suggestions
Secondly, there is a problem with exporting certain Plomino documents, where I get this error:
Traceback (innermost last):
  Module ZPublisher.Publish, line 138, in publish
  Module ZPublisher.mapply, line 77, in mapply
  Module ZPublisher.Publish, line 48, in call_object
  Module Products.CMFPlomino.PlominoReplicationManager, line 1241, in manage_exportAsXML
  Module Products.CMFPlomino.PlominoReplicationManager, line 1296, in exportAsXML
  Module Products.CMFPlomino.PlominoReplicationManager, line 1319, in exportDocumentAsXML
  Module xmlrpclib, line 1085, in dumps
  Module xmlrpclib, line 632, in dumps
  Module xmlrpclib, line 654, in __dump
  Module xmlrpclib, line 735, in dump_struct
  Module xmlrpclib, line 652, in __dump
TypeError: cannot marshal <class 'stripe.resource.Charge'> objects
I am not sure how to get rid of this error, as it comes from a Python package: "stripe-1.25.0-py2.7.egg"
Thanks again
That's because one of the items of the document is not serialisable. Regular Plomino items are supposed to be serialisable so I guess it is an item created programmatically by one of your formulas and it is not a simple type (like a string, integer, date, array, dict...).
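To narrow down which item is the culprit, one option is to try marshalling each item value on its own with the same module that appears in the traceback. The following is only a minimal sketch: `find_unmarshallable` is a hypothetical helper, it assumes you have already collected the document's items into a plain dict of item name to value (how to do that in Plomino is not shown here), and the `allow_none` setting is a guess that may not match what the export uses.

```python
try:
    import xmlrpclib  # Python 2 name, as seen in the traceback
except ImportError:
    import xmlrpc.client as xmlrpclib  # Python 3 name for the same module


def find_unmarshallable(items, allow_none=False):
    """Return sorted (item name, error message) pairs for values that
    xmlrpclib refuses to marshal.

    `items` is assumed to be a plain dict of item name -> value.
    """
    bad = []
    for name, value in items.items():
        try:
            # dumps() expects a tuple of parameters, so wrap the single value
            xmlrpclib.dumps((value,), allow_none=allow_none)
        except (TypeError, OverflowError) as exc:
            bad.append((name, str(exc)))
    return sorted(bad)
```

Any name this reports is a candidate for conversion to a simple type (for example, keeping only the identifier string of the stored object) before the document is saved.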
Thanks @ebrehault. That is less of a worry considering what is happening with the blob storage.
Do you know if anyone else has faced a similar issue, where files containing screenshots or images have different sizes in blob storage and on the Windows file system?
If I am able to fix the blob storage, then I might not have to worry too much about exporting the documents.
No, never heard of it before.
No problem @ebrehault. I figured out the problem with blob storage bloating eventually.
It turns out that when we were replicating the data between two Plomino databases (not using the replication tab), each attachment was being recreated in the destination database on a daily basis along with the updated data from the source, thus causing the bloat over time.
I have cleared out all the duplicates and am also looking at other options to clear the views once they have been processed into the destination database. I am not sure why replication is not being used in our case, but I am hoping there was a reason for it.
What was previously a blobstorage of size 45GB is now 12GB.
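For anyone who suspects the same kind of duplication, a rough first check is to look for files with identical content under a directory. The sketch below is an assumption-laden illustration and not how the duplicates were actually found in this case: `find_duplicates` and `file_digest` are hypothetical helpers, and identical content on disk is only a hint (it does not tell you which documents reference the files). It only reports and never deletes anything.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return {digest: [paths]} for content that occurs more than once."""
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so the same file is not counted twice
            if os.path.isfile(path) and not os.path.islink(path):
                by_digest[file_digest(path)].append(path)
    return dict(
        (digest, sorted(paths))
        for digest, paths in by_digest.items()
        if len(paths) > 1
    )
```

Run it against a copy of the storage or an exported set of attachments, and do the actual clean-up through the application rather than by removing files by hand.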
Thanks for your help and support with exporting documents, without which I would not have figured this out.
Hi,
Sorry if this appears to be a vague question - has anyone ever experienced the blobstorage growing exponentially?
We seem to have a mysterious and tumorous growth, where it currently stands at 40GB. We are unsure how to profile/analyse this blobstorage to figure out the possible reasons for this unsustainable size.
Can someone kindly recommend any profiling tools for blobstorage please? Is there any configuration that tells us what sort of elements are stored in the blobstorage?
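As a rough starting point that needs no Plone tooling, the size of each top-level folder under the blobstorage directory can be totalled with a few lines of standard-library Python. This is only a sketch: `tree_size`, `folder_sizes` and `largest` are hypothetical helper names, and the result shows where the bytes are on disk, not which site objects own them.

```python
import os


def tree_size(path):
    """Total size in bytes of all regular files below path."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                continue
            try:
                total += os.path.getsize(full)
            except OSError:
                # The file vanished or is unreadable; skip it
                pass
    return total


def folder_sizes(root):
    """Return {folder name: total bytes} for each directory directly in root."""
    sizes = {}
    for name in os.listdir(root):
        full = os.path.join(root, name)
        if os.path.isdir(full) and not os.path.islink(full):
            sizes[name] = tree_size(full)
    return sizes


def largest(sizes, n=10):
    """Return the n biggest (name, bytes) pairs, largest first."""
    return sorted(sizes.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

Calling `largest(folder_sizes(path))` with the path to the blobstorage directory gives the biggest folders first. Running it on different days and comparing the numbers shows which folders are growing.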
Our environment is:
Plone 4.3.3 (4308)
CMF 2.2.7
Zope 2.13.22
Python 2.7.6 (default, Jun 2 2016, 08:43:38) [GCC 4.8.4]
PIL 2.3.0 (Pillow)
We have roughly around 2000 users. About 50 Forms and a similar number of Views. The audit trail to collect "Save" and "Delete" actions for documents is on.
Not sure if this helps, but the ZMI Index page shows around 13500 records. We do store documents, but they are considerably smaller in total (probably about 2-3GB), as they are PDFs, Word documents or JPEGs not exceeding 5MB each. The Data.fs is currently around 3.5GB in size.
I am hoping that the above information will be of some help but if I missed any useful information then please let me know.
Thanks