cznic / ql

github.com/cznic/ql has moved to modernc.org/ql
https://godoc.org/modernc.org/ql
BSD 3-Clause "New" or "Revised" License

Question about minimizing memory usage #71

Closed twitchyliquid64 closed 10 years ago

twitchyliquid64 commented 10 years ago

Hi there,

The project I am currently working on involves reading an OpenStreetMap file and inserting it into a database. As the objects are read from the XML, they are sent to one of three workers via a channel, one worker for each type of object. Each worker does a commit every 1000 objects. As such, there are three open transactions most of the time. It seems that you can't have simultaneous transactions, so I'll have to put everything in one TX.

On my dev machine I have 16 GB of RAM, so I've never had an issue. QL seems to do EVERYTHING in RAM until I call DB.Close(), and then it spends 10 seconds writing it to the drive.

I was wondering how this behaviour changes on a machine with limited memory. When the buffer fills up, will QL write to disk at the tail end of each commit? Or will the process simply run out of memory and crash?

twitchyliquid64 commented 10 years ago

I'm doing a little more reading into this, and from what I can gather this depends on the closing of the 'write collecting window'. What conditions trigger the closure of a write collecting window?

cznic commented 10 years ago

Doing bulk updates (say 100 to 1000 rows) is a good idea. Whether it's still a good idea when done concurrently in different goroutines is another question.

Transactions are isolated (the I in ACID). The current implementation isolates transactions by serializing them. When a tool converts some data to a DB, that may or may not be fine. Let's construct a worst case: there are different types of data going to different tables, and one type appears only once, as the last item of the input data. If the goroutine serving this table happens to win the competition for being the first serialized transaction, then bad things can happen, including a deadlock, if the other tables' channels fill up and block the producer/type-dispatcher mux.

About the collecting window

A COMMIT, if and only if it is at the top level (TNL: Transaction Nesting Level, TLC: Top Level COMMIT), enables persisting the DB mutations (held completely in RAM until then) to the DB file. The collecting window is opened when DB mutation starts. The window ends when forced (Flush, Close), or on the timeout (1 second) or on the TLC, whichever of the last two happens later.

This describes the FSM:

+------------+-----------------+---------------+-----------------+
|\  Event    |                 |               |                 |
| \--------\ |    BEGIN        |    COMMIT     |     timeout     |
|   State   \|                 |               |                 |
+------------+-----------------+---------------+-----------------+
| idle       | BeginUpdate     | panic         | panic           |
|            | TNL  = 1        |               |                 |
|            | start timer     |               |                 |
|            | S = collecting  |               |                 |
+------------+-----------------+---------------+-----------------+
| collecting | TNL++           | TNL--         | S = collecting- |
|            |                 | if TNL == 0   |     triggered   |
|            |                 |     S = idle- |                 |
|            |                 |         armed |                 |
+------------+-----------------+---------------+-----------------+
| idle-      | TNL = 1         | panic         | EndUpdate       |
| armed      | S = collecting- |               | S = idle        |
|            |     armed       |               |                 |
+------------+-----------------+---------------+-----------------+
| collecting-| TNL++           | TNL--         | S = collecting- |
| armed      |                 | if TNL == 0   |     triggered   |
|            |                 |     S = idle- |                 |
|            |                 |         armed |                 |
+------------+-----------------+---------------+-----------------+
| collecting-| TNL++           | TNL--         | panic           |
| triggered  |                 | if TNL == 0   |                 |
|            |                 |     EndUpdate |                 |
|            |                 |     S = idle  |                 |
+------------+-----------------+---------------+-----------------+

If you need to fill different tables concurrently, you can run into a deadlock when you try to optimize writes using bulk updates (if the sources are finite-sized channels).

PS: Transactions are resource (RAM) limited, but this is intentionally not documented because the plan was always to turn this limit into available disk space (WIP). However, updating zillions of rows within a single transaction is rather problematic with any DB. Even when the pending mutations overflow to disk, there is still a problem with consistency (the C in ACID), because all the other transactions must wait, restart or collapse depending on what happens at the TLC of the long-running one, if they use that transaction's data before or during the mutation. Somewhat related (and WIP): http://en.wikipedia.org/wiki/Multiversion_concurrency_control

twitchyliquid64 commented 10 years ago

OK, I've had the deadlock issue you're talking about. I guess the easiest way to deal with it is just to dispatch inserts for all three types from one goroutine.

db_worker_goroutine(type1 chan, type2 chan, type3 chan) {
        select {
        ....

I still don't follow what you mean about memory usage. If I do:

count := 0
for obj := range channelOfData {
        count++
        if (count%1000)==0{
                //commit the current transaction with commit() and recreate it with begin()
        }
        tx.Exec("INSERT INTO blah blah", obj)
}

Or in other words, I have a single transaction at a time, and no more than 1000 objects are inserted within a single transaction. I'm not sure of the effect of bulk updating different tables in one transaction but I hope it's okay.

If I do this, will QL keep its memory usage in check?

The window ends when forced (Flush, Close), or on the timeout (1 second) or on the TLC, whichever of the last two happens later.

Does that mean that if I close all open transactions and then time.Sleep(2 * time.Second), the data will be forced to disk?

cznic commented 10 years ago

OK, I've had the deadlock issue you're talking about. I guess the easiest way to deal with it is just to dispatch inserts for all three types from one goroutine.

db_worker_goroutine(type1 chan, type2 chan, type3 chan) {
        select {
        ....

Looks good to me.
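
Spelled out, such a single worker might look roughly like this; the channel element type and the insertObject helper are placeholders for illustration, not anything from your code:

// insertObject is a stand-in for whatever actually issues the INSERT.
func insertObject(kind string, obj interface{}) {
        // tx.Exec(...) would go here.
}

// dbWorker drains all three channels in a single goroutine, so only one
// transaction pipeline exists and the producers cannot deadlock each other.
func dbWorker(nodes, ways, relations chan interface{}) {
        for nodes != nil || ways != nil || relations != nil {
                select {
                case obj, ok := <-nodes:
                        if !ok {
                                nodes = nil // channel drained and closed
                                continue
                        }
                        insertObject("node", obj)
                case obj, ok := <-ways:
                        if !ok {
                                ways = nil
                                continue
                        }
                        insertObject("way", obj)
                case obj, ok := <-relations:
                        if !ok {
                                relations = nil
                                continue
                        }
                        insertObject("relation", obj)
                }
        }
}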

I still don't follow what you mean about memory usage. If I do:

count := 0
for obj := range channelOfData {
        count++
        if (count%1000)==0{
                //commit the current transaction with commit() and recreate it with begin()
        }
        tx.Exec("INSERT INTO blah blah", obj)
}

Or in other words, I have a single transaction at a time, and no more than 1000 objects are inserted within a single transaction. I'm not sure of the effect of bulk updating different tables in one transaction but I hope it's okay.

This should work just fine if there is no forgotten outer transaction and no other long-lasting "concurrent" transaction.
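
Spelled out with database/sql and the ql driver, that loop could look roughly like this; the table name, the single data column and the minimal error handling are illustrative only:

package main

import (
        "database/sql"

        _ "github.com/cznic/ql/driver" // registers the "ql" driver
)

// insertAll commits after every 1000 rows, so at most one transaction is
// open and no more than 1000 pending rows are ever held in it.
func insertAll(db *sql.DB, data <-chan string) error {
        tx, err := db.Begin()
        if err != nil {
                return err
        }
        count := 0
        for obj := range data {
                if _, err = tx.Exec("INSERT INTO blah (data) VALUES ($1)", obj); err != nil {
                        tx.Rollback()
                        return err
                }
                count++
                if count%1000 == 0 {
                        if err = tx.Commit(); err != nil {
                                return err
                        }
                        if tx, err = db.Begin(); err != nil {
                                return err
                        }
                }
        }
        return tx.Commit() // persist the final partial batch
}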

Note that I meant a different version of 'bulk update':

BEGIN TRANSACTION;
        INSERT INTO t VALUES
                ($1),
                ($2),
                ...,
                ($100),
        ;
COMMIT;

However, your code is okay, just a bit slower since it executes one row at a time.
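
For reference, building that multi-row form programmatically might look roughly like this; the table name and single-column rows are illustrative:

package main

import (
        "database/sql"
        "fmt"
        "strings"
)

// insertBatch writes a whole slice of rows with one multi-row INSERT,
// saving the per-statement overhead of one Exec per row.
func insertBatch(tx *sql.Tx, rows []string) error {
        if len(rows) == 0 {
                return nil
        }
        var b strings.Builder
        args := make([]interface{}, 0, len(rows))
        b.WriteString("INSERT INTO t VALUES")
        for i, r := range rows {
                if i > 0 {
                        b.WriteString(",")
                }
                fmt.Fprintf(&b, " ($%d)", i+1)
                args = append(args, r)
        }
        _, err := tx.Exec(b.String(), args...)
        return err
}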

If I do this, will QL keep its memory usage in check?

AFAICT, it should. The RAM used is what's needed to hold the updates for the 1000 rows, regardless of whether they are from different tables.

The window ends when forced (Flush, Close), or on the timeout (1 second) or on the TLC, whichever of the last two happens later. Does that mean that if I close all open transactions and then time.Sleep(2 * time.Second), the data will be forced to disk?

Yes, sometimes ;-)

The window ends on the timeout or the TLC, whichever happens later. In effect: if several short, sub-second transactions TLC-commit within the same second, they all start to be written to the disk together at the end of the one-second window (which improves performance by several orders of magnitude), because the timeout happened later. When any single open transaction lasts for more than one second before it is committed, it starts to be written to the disk immediately, because the TLC happened later. IOW, TLC-committed data are never held in RAM for more than one second before they start to be written out. So yes, idling for more than one second starts writing (2PC/WAL) the TLC-committed data. But that may take some time T. So idling for 2 seconds makes your data sit on the disk if T < 1. (For 1000 non-blob rows, T might easily be < 1 second.)
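
To illustrate with ql's native API rather than database/sql: the window can also be ended explicitly instead of waiting it out, assuming DB.Flush is available in the ql version used; the file name and table here are arbitrary:

package main

import (
        "log"

        "github.com/cznic/ql"
)

func main() {
        db, err := ql.OpenFile("test.db", &ql.Options{CanCreate: true})
        if err != nil {
                log.Fatal(err)
        }
        // Close also ends the collecting window and persists anything pending.
        defer db.Close()

        ctx := ql.NewRWCtx()
        _, _, err = db.Run(ctx, `
BEGIN TRANSACTION;
        CREATE TABLE IF NOT EXISTS t (data string);
        INSERT INTO t VALUES ("hello");
COMMIT;`)
        if err != nil {
                log.Fatal(err)
        }

        // Instead of sleeping past the 1 second window, end it explicitly.
        if err = db.Flush(); err != nil {
                log.Fatal(err)
        }
}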

twitchyliquid64 commented 10 years ago

Awesome. Just coded that, testing it now.

1000-5000ish rows per commit works great. Hell, when I was developing and testing the thing I did >10 million without any problems, but this is more responsive.

I'm doing time.Sleep() every 15k rows, which makes the data show up on disk. However, the memory usage keeps rising. Is that expected? (Maybe the OS (Windows for dev) isn't freeing the buffers.)

cznic commented 10 years ago

Go memory handling (the allocator and garbage collector) is pretty complex. Some things happen only at longer intervals (a minute or more), like, IIRC, returning unused RAM to the OS, and, IIRC again, Go on Windows doesn't do that at all.

Otherwise, as usual, do memory profiling to be sure you aren't leaking objects by keeping them reachable even when they are no longer needed, which is a common situation because it's rather easy to overlook such things in non-trivial programs.
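
For example, a heap profile can be written at a suspicious point with runtime/pprof and then inspected with go tool pprof; the file name is arbitrary:

package main

import (
        "log"
        "os"
        "runtime/pprof"
)

// writeHeapProfile dumps the current heap profile; inspect it afterwards
// with `go tool pprof <binary> heap.pprof`.
func writeHeapProfile(path string) {
        f, err := os.Create(path)
        if err != nil {
                log.Fatal(err)
        }
        defer f.Close()
        if err := pprof.WriteHeapProfile(f); err != nil {
                log.Fatal(err)
        }
}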

PS: Once your code works properly, I suggest removing the time.Sleep altogether. If you're doing TLC commits every few {hundred,thousand} rows, the timeout/TLC state machine should do the job for you anyway. Just don't forget to .Close the DB before exiting.

twitchyliquid64 commented 10 years ago

OK, I implemented the changes you suggested and tested on a low-memory system. I'm afraid ql isn't performing as expected: memory usage keeps growing and growing until the application crashes.

What I did:

Profiling returned:

(pprof) top10
Total: 22.0 MB
     9.0  40.9%  40.9%      9.0  40.9% github.com/cznic/exp/lldb.(*bitFiler).ReadAt
     6.5  29.5%  70.5%     10.0  45.5% github.com/cznic/exp/lldb.(*bitFiler).WriteAt
     3.0  13.6%  84.1%      3.0  13.6% github.com/cznic/exp/lldb.(*cache).get2
     1.5   6.8%  90.9%      2.0   9.1% main.process
     0.5   2.3%  93.2%      0.5   2.3% github.com/cznic/bufs.(*Cache).get
     0.5   2.3%  95.5%      0.5   2.3% github.com/cznic/exp/lldb.(*acidWriter0).WriteAt
     0.5   2.3%  97.7%      2.5  11.4% main.main
     0.5   2.3% 100.0%      0.5   2.3% reflect.unsafe_NewArray

ReadAt():

(pprof) list ReadAt
Total: 22.0 MB
ROUTINE ====================== github.com/cznic/exp/lldb.(*RollbackFiler).ReadAt in /root/osm/go-osm-parse/src/github.com/cznic/exp/lldb/xact.go
   0.0    5.5 Total MB (flat / cumulative)
     .      .  537:
     .      .  538:     return r.bitFiler.PunchHole(off, size)
     .      .  539: }
     .      .  540:
     .      .  542: func (r *RollbackFiler) ReadAt(b []byte, off int64) (n int, err error) {
     .      .  543:     r.inCallbackMu.RLock()
     .      .  544:     defer r.inCallbackMu.RUnlock()
     .      .  545:     if !r.inCallback {
     .      .  546:         r.mu.RLock()
     .      .  547:         defer r.mu.RUnlock()
     .      .  548:     }
     .      .  549:     if r.tlevel == 0 {
     .      .  550:         return r.f.ReadAt(b, off)
     .      .  551:     }
     .      .  552:
     .    5.5  553:     return r.bitFiler.ReadAt(b, off)
     .      .  554: }
     .      .  555:
     .      .  557: func (r *RollbackFiler) Rollback() (err error) {
     .      .  558:     r.mu.Lock()
     .      .  559:     defer r.mu.Unlock()
ROUTINE ====================== github.com/cznic/exp/lldb.(*bitFiler).ReadAt in /root/osm/go-osm-parse/src/github.com/cznic/exp/lldb/xact.go
   9.0   14.5 Total MB (flat / cumulative)
     .      .  136:         f.m[pgI] = pg
     .      .  137:     }
     .      .  138:     return
     .      .  139: }
     .      .  140:
     .      .  141: func (f *bitFiler) ReadAt(b []byte, off int64) (n int, err error) {
     .      .  142:     avail := f.size - off
     .      .  143:     pgI := off >> bfBits
     .      .  144:     pgO := int(off & bfMask)
     .      .  145:     rem := len(b)
     .      .  146:     if int64(rem) >= avail {
     .      .  147:         rem = int(avail)
     .      .  148:         err = io.EOF
     .      .  149:     }
     .      .  150:     for rem != 0 && avail > 0 {
     .      .  151:         pg := f.m[pgI]
     .      .  152:         if pg == nil {
   8.5    8.5  153:             pg = &bitPage{}
     .      .  154:             if f.parent != nil {
     .    5.5  155:                 _, err = f.parent.ReadAt(pg.data[:], off&^bfMask)
     .      .  156:                 if err != nil && !fileutil.IsEOF(err) {
     .      .  157:                     return
     .      .  158:                 }
     .      .  159:
     .      .  160:                 err = nil
     .      .  161:             }
   0.5    0.5  162:             f.m[pgI] = pg
     .      .  163:         }
     .      .  164:         nc := copy(b[:mathutil.Min(rem, bfSize)], pg.data[pgO:])
     .      .  165:         pgI++
     .      .  166:         pgO = 0
     .      .  167:         rem -= nc
     .      .  168:         n += nc
     .      .  169:         b = b[nc:]
     .      .  170:         off += int64(nc)
     .      .  171:     }
     .      .  172:     return
     .      .  173: }
     .      .  174:
     .      .  175: func (f *bitFiler) Truncate(size int64) (err error) {
     .      .  176:     switch {
     .      .  177:     case size < 0:

Looks like the big one is pg = &bitPage{}. Is this something for you to look into or have I stuffed up?

cznic commented 10 years ago

I can attempt to use sync.Pool for bitPage pooling, if possible. However, that would make QL (transitively) dependent on Go 1.3. Is that okay for you?
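
Roughly the kind of recycling I have in mind; the bitPage type and page size below are stand-ins for illustration, not the actual lldb definitions:

package main

import "sync"

// bitPage is a stand-in here; the real type lives in cznic/exp/lldb.
type bitPage struct {
        data [4096]byte
}

// bitPagePool recycles pages instead of allocating a fresh one on every
// cache miss, which is where the profile above shows the allocations
// coming from. sync.Pool requires Go 1.3+.
var bitPagePool = sync.Pool{
        New: func() interface{} { return new(bitPage) },
}

func getPage() *bitPage {
        pg := bitPagePool.Get().(*bitPage)
        pg.data = [4096]byte{} // zero a recycled page before reuse
        return pg
}

func putPage(pg *bitPage) {
        bitPagePool.Put(pg)
}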

twitchyliquid64 commented 10 years ago

I don't understand. What's the problem?

cznic commented 10 years ago

I thought you were using Go 1.2 (https://github.com/cznic/fileutil/issues/11).

twitchyliquid64 commented 10 years ago

I can use 1.3; I'm saying I don't understand the cause of the memory problem.

cznic commented 10 years ago

I don't see a problem, the memory usage seems just fine. However, I see an opportunity to make it lower by recycling the bitPage objects as they're consuming nearly half of that memory.

twitchyliquid64 commented 10 years ago

Why is the memory continually growing? I was under the impression that ql could write any amount of data this way without memory getting out of control.

cznic commented 10 years ago

The measurement doesn't show any excessive growth, if I understood it correctly. 22 MB is pretty moderate memory usage (buffers, caches, ...) when 100k rows have been processed. If memory use does grow, please try to profile it while it is growing and has grown to some much bigger size. Then we can hopefully see the problem.

twitchyliquid64 commented 10 years ago

Okay, I have done a second profile at the point just before the process crashes (890,000k rows * number of tags) as it runs out of memory. The same lines of code are at fault here, which very strongly suggests to me that there is a memory leak somewhere.

(pprof) top10
Total: 54.6 MB
    34.4  62.9%  62.9%     34.4  62.9% github.com/cznic/exp/lldb.(*bitFiler).ReadAt
    15.5  28.4%  91.3%     15.5  28.4% github.com/cznic/exp/lldb.(*cache).get2
     1.7   3.2%  94.5%      1.7   3.2% github.com/cznic/exp/lldb.(*acidWriter0).WriteAt
     1.0   1.8%  96.3%      6.0  11.0% github.com/cznic/exp/lldb.(*bitFiler).WriteAt
     1.0   1.8%  98.2%      1.0   1.8% main.process
     0.5   0.9%  99.1%      0.5   0.9% github.com/cznic/bufs.(*Cache).get
     0.5   0.9% 100.0%      1.5   2.7% main.main
     0.0   0.0% 100.0%      0.5   0.9% database/sql.(*Tx).Commit

Same deal with the ReadAt function.

(pprof) list ReadAt
Total: 54.6 MB
ROUTINE ====================== github.com/cznic/exp/lldb.(*RollbackFiler).ReadAt in /root/osm/go-osm-parse/src/github.com/cznic/exp/lldb/xact.go
   0.0   29.4 Total MB (flat / cumulative)
     .      .  537:
     .      .  538:     return r.bitFiler.PunchHole(off, size)
     .      .  539: }
     .      .  540:
     .      .  541: // Implements Filer.
     .      .  542: func (r *RollbackFiler) ReadAt(b []byte, off int64) (n int, err error) {
     .      .  543:     r.inCallbackMu.RLock()
     .      .  544:     defer r.inCallbackMu.RUnlock()
     .      .  545:     if !r.inCallback {
     .      .  546:         r.mu.RLock()
     .      .  547:         defer r.mu.RUnlock()
     .      .  548:     }
     .      .  549:     if r.tlevel == 0 {
     .      .  550:         return r.f.ReadAt(b, off)
     .      .  551:     }
     .      .  552:
     .   29.4  553:     return r.bitFiler.ReadAt(b, off)
     .      .  554: }
     .      .  555:
     .      .  556: // Implements Filer.
     .      .  557: func (r *RollbackFiler) Rollback() (err error) {
     .      .  558:     r.mu.Lock()
     .      .  559:     defer r.mu.Unlock()
ROUTINE ====================== github.com/cznic/exp/lldb.(*bitFiler).ReadAt in /root/osm/go-osm-parse/src/github.com/cznic/exp/lldb/xact.go
  34.4   63.8 Total MB (flat / cumulative)
     .      .  136:         f.m[pgI] = pg
     .      .  137:     }
     .      .  138:     return
     .      .  139: }
     .      .  140:
     .      .  141: func (f *bitFiler) ReadAt(b []byte, off int64) (n int, err error) {
     .      .  142:     avail := f.size - off
     .      .  143:     pgI := off >> bfBits
     .      .  144:     pgO := int(off & bfMask)
     .      .  145:     rem := len(b)
     .      .  146:     if int64(rem) >= avail {
     .      .  147:         rem = int(avail)
     .      .  148:         err = io.EOF
     .      .  149:     }
     .      .  150:     for rem != 0 && avail > 0 {
     .      .  151:         pg := f.m[pgI]
     .      .  152:         if pg == nil {
  31.0   31.0  153:             pg = &bitPage{}
     .      .  154:             if f.parent != nil {
     .   29.4  155:                 _, err = f.parent.ReadAt(pg.data[:], off&^bfMask)
     .      .  156:                 if err != nil && !fileutil.IsEOF(err) {
     .      .  157:                     return
     .      .  158:                 }
     .      .  159:
     .      .  160:                 err = nil
     .      .  161:             }
   3.4    3.4  162:             f.m[pgI] = pg
     .      .  163:         }
     .      .  164:         nc := copy(b[:mathutil.Min(rem, bfSize)], pg.data[pgO:])
     .      .  165:         pgI++
     .      .  166:         pgO = 0
     .      .  167:         rem -= nc
     .      .  168:         n += nc
     .      .  169:         b = b[nc:]
     .      .  170:         off += int64(nc)
     .      .  171:     }
     .      .  172:     return
     .      .  173: }
     .      .  174:
     .      .  175: func (f *bitFiler) Truncate(size int64) (err error) {
     .      .  176:     switch {
     .      .  177:     case size < 0:

Here is the profile of the second function for completeness:

(pprof) list get2
Total: 54.6 MB
ROUTINE ====================== github.com/cznic/exp/lldb.(*cache).get2 in /root/osm/go-osm-parse/src/github.com/cznic/exp/lldb/falloc.go
  15.5   15.5 Total MB (flat / cumulative)
     .      . 1799: func (c *cache) get(n int) *node {
     .      . 1800:     r, _ := c.get2(n)
     .      . 1801:     return r
     .      . 1802: }
     .      . 1803:
     .      . 1804: func (c *cache) get2(n int) (r *node, isZeroed bool) {
     .      . 1805:     s := *c
     .      . 1806:     lens := len(s)
     .      . 1807:     if lens == 0 {
  14.0   14.0 1808:         return &node{b: make([]byte, n, mathutil.Min(2*n, maxBuf))}, true
     .      . 1809:     }
     .      . 1810:
     .      . 1811:     i := sort.Search(lens, func(x int) bool { return len(s[x].b) >= n })
     .      . 1812:     if i == lens {
     .      . 1813:         i--
   1.5    1.5 1814:         s[i].b, isZeroed = make([]byte, n, mathutil.Min(2*n, maxBuf)), true
     .      . 1815:     }
     .      . 1816:
     .      . 1817:     r = s[i]
     .      . 1818:     r.b = r.b[:n]
     .      . 1819:     copy(s[i:], s[i+1:])
     .      . 1820:     s = s[:lens-1]
     .      . 1821:     *c = s
     .      . 1822:     return
     .      . 1823: }
     .      . 1824:
     .      . 1825: func (c *cache) cget(n int) (r *node) {
     .      . 1826:     r, ok := c.get2(n)
     .      . 1827:     if ok {

twitchyliquid64 commented 10 years ago

Wait, did I just read that: Total: 54.6 MB

Why am I running out of memory then ...

twitchyliquid64 commented 10 years ago

Okay, if I run runtime.GC() every 10,000 rows it seems to fix the memory issues.

Hmmmmm....

Apologies for the goose chase.
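
For the record, the workaround amounts to something like this in the insert loop; the 10,000-row interval is just what I picked, and runtime/debug.FreeOSMemory would be an alternative way to hand freed memory back to the OS sooner:

package main

import "runtime"

// maybeReclaim forces a collection every 10,000 rows, which is the
// workaround described above.
func maybeReclaim(rows int) {
        if rows%10000 == 0 {
                runtime.GC()
        }
}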

twitchyliquid64 commented 10 years ago

Can confirm that this method worked on a VPS with 512 MB of RAM, inserting close to 3 million rows :)

Thanks for your help. Gonna close.

cznic commented 10 years ago

Glad to hear it works for you ;-)

cznic commented 10 years ago

FYI: This might be related. (Fixed only at tip.)