scheme shell hangs while saving atomspace with (sql-store)

mjsduncan commented 7 years ago

i was actually re-saving an entire 4M atom atomspace, which is clearly a bad idea, but it was still stuck after a period several times greater than it took to initially save, and turning off the pgsql server didn't help, so i had to kill the guile binary. tangentially, on a different occasion i tried invoking delete-recursive and got a message saying it wasn't implemented and the guile binary crashed. i'm making a point of doing my work without the cogserver, but in its favor it could handle these kinds of problems without borking the atomspace. is there any way to make the scheme binding to pgsql more robust?

mjsduncan commented 7 years ago

the guile binary crashes after an invocation of cog-delete also (or maybe this is a repeat of the above tangent and i mis-remembered using delete-recursive):

> (for-each cog-delete (cog-filter 'MemberLink (cog-get-trunk (ConceptNode "GO pathway"))))
terminate called after throwing an instance of 'opencog::RuntimeException'
  what():  Not implemented!!! (/home/mjsd/oc/atomspace/opencog/atomspace/AtomSpace.cc:407)

linas commented 7 years ago

I save atomspaces with 60M or more atoms weekly. But I never do batch-saves, I do one-atom-at-a-time saves, whenever that atom is ready (and needs saving). It's sort of less painful to do saves on-the-go. The fastest I've ever seen is about 7K-8K atoms/sec, sometimes settling down to 2K or 3K atoms/sec, so big databases can take hours. A 4M atom dataset should take less than half an hour.

If it's taking longer than that, then make sure you follow the postgres tuning/setup instructions in the README. Postgres will run 100x slower if you don't disable its various safety features. SSD disks are faster than spinning disks, by a LOT. Use SATA attached internal hard drives; saving to USB external hard drives will be slowww.
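
To make the throughput figures above concrete, here is a small back-of-the-envelope sketch in plain Guile. The helper name save-time-minutes is made up for illustration and is not part of any OpenCog API; the rates are the ones quoted in the comment.

```scheme
;; Rough save-time estimate from the quoted throughput figures.
;; save-time-minutes is a hypothetical helper, not an OpenCog function.
(define (save-time-minutes atom-count atoms-per-second)
  ;; seconds = atoms / rate; divide by 60 to get minutes
  (exact->inexact (/ atom-count atoms-per-second 60)))

;; Best case, typical case, slow case (atoms/sec), as quoted above.
(define rates '(8000 3000 2000))

(for-each
  (lambda (rate)
    (format #t "~a atoms at ~a atoms/sec: about ~a minutes~%"
            4000000 rate (round (save-time-minutes 4000000 rate))))
  rates)
```

For 4M atoms this works out to roughly 8 minutes at 8K atoms/sec, 22 minutes at 3K atoms/sec, and 33 minutes at 2K atoms/sec, which is in line with the "about half an hour" estimate. A save that runs for hours on a dataset of this size is well outside that range.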

linas commented 7 years ago

cog-extract removes an atom from the atomspace. By contrast, cog-delete removes an atom from the atomspace and also from the SQL database. If there is no open SQL DB, then delete and extract do the same thing.

Deleting from SQL is not implemented, because I never needed it and never got around to it. It's not hard... a couple of days' work, maybe. In the 10 years we've had this, you're the third person to trip over this, and the first two did not need to actually delete, they only needed to extract.
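
As a sketch of the extract-only alternative, the crashing call from earlier in the thread can be rewritten with cog-extract in place of cog-delete. This assumes the (opencog) guile module is loaded and that the function names are as used in this thread; newer releases may spell them differently.

```scheme
(use-modules (opencog))

;; Same selection as in the crashing call, but collected first so it can
;; be inspected before anything is removed.
(define pathway-members
  (cog-filter 'MemberLink (cog-get-trunk (ConceptNode "GO pathway"))))

;; cog-extract only removes atoms from the in-RAM atomspace, so it never
;; reaches the unimplemented SQL delete path.
(for-each cog-extract pathway-members)
```

Note that atoms removed this way are still present in the SQL database, so this only helps when the goal is to clean up the in-RAM atomspace. An atom that still has other links pointing at it may not be extracted by the non-recursive call.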

linas commented 7 years ago

anyway, batch-saves should work without hanging. If you can reproduce, explain how. If it's hard to reproduce: ... make sure you have enough disk space. Make sure you are not running out of RAM. (use top to check)

If you run out of RAM, then the operating system will swap and thrash, and everything will get really really slow. If you also run out of swap space, then the OS OOM-killer (out-of-memory killer) will kill the fattest, juiciest target it can find, as long as it's not X11.

linas commented 7 years ago

A 4M dataset can take longer than what I said, if the atoms are festooned with lots of truth values and other values.

mjsduncan commented 7 years ago

it's not a memory or disk space problem. i've saved this atomspace multiple times and half an hour sounds about right. i'm not dealing with truth values yet ;P the time i'm talking about i tried to re-save a copy of the entire atomspace on top of the original while testing a script and after a couple hours i shut down the postgres server hoping to get the scheme interpreter to give up and give me a prompt. if you think it's worth trying again, what should i do to get diagnostic info since it doesn't crash? as far as deleting from sql, i'm experimenting with different versions of large sets containing lots of complex hypergraphs (the dreaded pathways) so being able to replace subsets of hypergraphs in sql without regenerating and resaving the entire bio-atomspace would be very useful in the short-to-medium-term when it's going to get a lot bigger than it is now as a toy/demo setup...

linas commented 7 years ago

ugh. There are no particular diagnostic tools for this, because hangs are not supposed to happen. If it happened to me, I know what I'd do -- I'd blow it off for as long as I can until I get annoyed enough to deal with it; and then devote the day or two needed. It's a lot easier to debug when the bug is on my system and I know what my data is, and what it's supposed to be doing. In your case.. I dunno. Did cpu usage drop to zero? Did it peg at 100%? if you break in with gdb ... ugh. that is all hard.

For cog-delete, it's issue #255; someone needs to sit down and do it. It's not too hard if you know SQL and C++; otherwise it's hard.

mjsduncan commented 7 years ago

then it won't be me at least this year. at the hang all the cores were between 0% and 5%. if i try to reproduce it again what's the best way to save my list of pathway graphs that were expensive to build and too big to save in sql? dump them as text? is there a way to save and restore the repl state?

linas commented 7 years ago

I'd convert the pathway graphs into something manageable, e.g. memberlinks or subset links, or the still-debated coloring/partition links. Surely that is easy to do. Any link that is large is just going to cause issues, not just for sql, but for pln, the pattern matcher, the pattern miner, etc.

This is not your fault; I am only now realizing that SetLink itself is a design flaw. I will have to redesign BindLink, GetLink, etc., so that they stop using SetLink, but this will be a lot of work.
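
A minimal sketch of the "many small links" idea, assuming the (opencog) guile module is loaded. The member names below are made-up sample data, and ConceptNode is used for the members only to keep the example free of bio-specific atom types.

```scheme
(use-modules (opencog))

;; Hypothetical sample data: names of things belonging to one pathway.
(define member-names '("gene-A" "gene-B" "gene-C"))
(define pathway (ConceptNode "GO pathway"))

;; Instead of one link whose arity grows with the pathway, create one
;; small MemberLink per element: (MemberLink element set).
(for-each
  (lambda (name)
    (MemberLink (ConceptNode name) pathway))
  member-names)

;; The members can be recovered from the pathway node's incoming set.
(define members-of-pathway
  (cog-filter 'MemberLink (cog-incoming-set pathway)))
```

Each MemberLink stays at arity two no matter how large the pathway gets, so no single atom becomes huge, and a subset of the pathway can be changed by touching only the affected links.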

You can print the contents of the atomspace, as scheme, and then suck that back in, as scheme. That's not something I ever do, but it "should work". The print functions are designed so that what they print is valid scheme, precisely in order to allow cut-n-paste, etc.

bgoertzel commented 7 years ago

Mike, I think the best thing is for you to export the Atoms formed by your scripts as a big Scheme file that just contains the Atoms already formed. Then you can just load the Atoms from this Scheme file without doing the algorithmic work to re-create them. Amen can help you with this, or maybe Hedra or Tensae....

linas commented 7 years ago

Ben, long term, we have to fix SetLink to disallow the pathological usages. We also need some appropriate variant of the color/partition thing. I strongly prefer color over partition, because partition becomes the law of excluded middle over sets of two items, while color does not.

amebel commented 7 years ago

@mjsduncan https://github.com/opencog/atomspace/blob/master/opencog/scm/opencog/base/file-utils.scm#L135-L169
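
A hedged sketch of the dump-and-reload approach discussed above. It assumes the (opencog) module exposes an export-all-atoms function from file-utils.scm; check the linked file for the exact name and behavior in your version. The output path is a made-up example.

```scheme
(use-modules (opencog))

;; ASSUMPTION: export-all-atoms comes from file-utils.scm and writes the
;; whole atomspace as scheme expressions. The path is hypothetical.
(export-all-atoms "/tmp/bio-atomspace.scm")

;; Later, in a fresh guile session (after loading the opencog module
;; again), the dump can be read back in as ordinary scheme:
;;   (use-modules (opencog))
;;   (load "/tmp/bio-atomspace.scm")
```

Because the dump is plain scheme, reloading it recreates the atoms without rerunning the scripts that built them, and without going through the SQL backend at all.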

mjsduncan commented 7 years ago

thanks linas and amen!

linas commented 7 years ago

closing, I assume all issues have been resolved here.

opencog / atomspace

scheme shell hangs while saving atomspace with (sql-store) #1326