cubewise-code / tm1py

TM1py is a Python package that wraps the TM1 REST API in a simple to use library.
http://tm1py.readthedocs.io/en/latest/
MIT License

Commit data ingestion #200

Closed kaleming closed 4 years ago

kaleming commented 4 years ago

Hi Marius,

Sometimes, after I load data into cubes, I notice that memory usage on the machine increases a lot.

After reading this PostgreSQL article, I suspected it could be related:

I recently helped a client with serious PostgreSQL problems. Their database is part of an application that configures and monitors large networks; each individual network node reports its status to the database every five minutes. The client was having problems with one table in particular, containing about 25,000 rows — quite small, by modern database standards, and thus unlikely to cause problems.

However, things weren’t so simple: This table was growing to more than 20 GB every week, even though autovacuum was running. In response, the company established a weekly ritual: Shut down the application, run a VACUUM FULL, and then restart things. This solved the problem in the short term — but by the end of the week, the database had returned to about 20 GB in size. Clearly, something needed to be done about it.

I’m happy to say that I was able to fix this problem, and that the fix wasn’t so difficult. Indeed, the source of the problem might well be obvious to someone with a great deal of PostgreSQL experience. That’s because the problem mostly stemmed from a failure to understand how PostgreSQL’s transactions work, and how they can influence the functioning of VACUUM, the size of your database, and the reliability of your system. I’ve found that this topic is confusing to many people who are new to PostgreSQL, and I thus thought that it would be useful to walk others through the problem, and then describe how we solved it.

Is it necessary to issue some extra commit after using tm1.cubes.cells.write_values_through_cellset?

MariusWirtz commented 4 years ago

Hi @kaleming,

No special kind of commit is required.

I have not seen this behaviour. When using the write_values_through_cellset method, TM1py does create cellsets in TM1, and they require some additional RAM. However, TM1py also removes them, and in any case all your cellsets are destroyed once you explicitly log out from the instance.
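To make sure the logout (and the resulting cellset cleanup) always happens, TM1py's TM1Service can be used as a context manager. A minimal sketch follows; the connection parameters, cube name, and cell coordinates are placeholders, not details from this thread:

```python
def write_with_guaranteed_logout(cube_name, cellset):
    """Write values to a cube and guarantee logout afterwards.

    Placeholder connection parameters below must be replaced with
    those of a real TM1 instance.
    """
    # Imported lazily so this sketch can be inspected even where
    # TM1py is not installed.
    from TM1py import TM1Service

    # Leaving the 'with' block calls logout automatically, even on
    # error, so server-side cellsets created by the session are freed.
    with TM1Service(address="localhost", port=12354,
                    user="admin", password="apple", ssl=True) as tm1:
        tm1.cubes.cells.write_values(cube_name, cellset)

# Example call (requires a running TM1 instance):
# write_with_guaranteed_logout(
#     "Sales", {("Actual", "2020", "Revenue"): 1000.0})
```

The context-manager form avoids leaking sessions (and their cellsets) when an exception interrupts the load.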

Perhaps you can experiment with the cube's technical dimension order to lower its overall memory footprint.
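Recent TM1py versions expose helpers for this: get_storage_dimension_order and update_storage_dimension_order on the cube service (they require a sufficiently recent TM1 server, so check your versions). A hedged sketch, with placeholder connection details and dimension names:

```python
def apply_storage_dimension_order(cube_name, new_order):
    """Reorder a cube's internal (storage) dimension order.

    'new_order' is a list of the cube's dimension names in the
    desired storage order; all names are placeholders here.
    """
    # Imported lazily so this sketch can be inspected even where
    # TM1py is not installed.
    from TM1py import TM1Service

    with TM1Service(address="localhost", port=12354,
                    user="admin", password="apple", ssl=True) as tm1:
        current = tm1.cubes.get_storage_dimension_order(cube_name)
        # Only rewrite the order if it actually differs.
        if current != new_order:
            tm1.cubes.update_storage_dimension_order(cube_name, new_order)

# Example call (requires a running TM1 instance):
# apply_storage_dimension_order(
#     "Sales", ["Version", "Period", "Account", "Region"])
```

Reordering changes only the internal storage layout, not how the cube appears to users, so it is a low-risk way to test the effect on memory consumption.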