Closed ArtemGr closed 12 years ago
This is a bug that i am trying to chase down since several weeks and i thought i have fixed it in 2.0.3... :(
It means that the key in the btree index points to a record which does not exist. It's an internal error that should never happen.
Did you create this database with 2.0.3 or 2.0.2? Did you close the database and re-open it, or did your application stop (crash or terminated during debugging), and when you re-opened the database it performed a recovery operation? Anything else you can tell me how i can reproduce this?
Thanks Christoph
2012/7/9 Artem < reply@reply.github.com
I'm getting a "Data blob not found" error from the ham_find. Makes no sense to me. Database: http://glim.ru/personal/bugs/FUC.7z
Reply to this email directly or view it on GitHub: https://github.com/cruppstahl/hamsterdb/issues/8
On Mon, Jul 9, 2012 at 1:58 PM, Christoph Rupp < reply@reply.github.com
wrote:
This is a bug that i am trying to chase down since several weeks and i thought i have fixed it in 2.0.3... :(
It means that the key in the btree index points to a record which does not exist. It's an internal error that should never happen.
Did you create this database with 2.0.3 or 2.0.2?
With the master checkout (commit fe5e42df7beb1f37f45c23188d83f46d6409e03b)
Did you close the database and re-open it, or did your application stop (crash or terminated during debugging), and when you re-opened the database it performed a recovery operation?
Yes, it looks like there was a crash at the time. BTW, is there a way to detect recovery when the database is opened? I would log it then.
Anything else you can tell me how i can reproduce this?
I can show the code although I doubt it'll be of any help.
One question - i looked at the pastebin once more, and it says:
Are you really using keys with a size of 32896 bytes? If yes then i'll write a sample which uses a similar key. If not then the problem was caused somewhere else...
Thanks Christoph
2012/7/9 Artem < reply@reply.github.com
On Mon, Jul 9, 2012 at 1:58 PM, Christoph Rupp < reply@reply.github.com
wrote:
This is a bug that i am trying to chase down since several weeks and i thought i have fixed it in 2.0.3... :(
It means that the key in the btree index points to a record which does not exist. It's an internal error that should never happen.
Did you create this database with 2.0.3 or 2.0.2?
With the master checkout (commit fe5e42df7beb1f37f45c23188d83f46d6409e03b)
Did you close the database and re-open it, or did your application stop (crash or terminated during debugging), and when you re-opened the database it performed a recovery operation?
Yes, it looks like there was a crash at the time. BTW, is there a way to detect recovery when the database is opened? I would log it then.
Anything else you can tell me how i can reproduce this?
I can show the code although I doubt it'll be of any help.
Reply to this email directly or view it on GitHub: https://github.com/cruppstahl/hamsterdb/issues/8#issuecomment-6845322
Are you really using keys with a size of 32896 bytes? If yes then i'll write a sample which uses a similar key. If not then the problem was caused somewhere else...
No, I use keys 38 bytes long.
What I'm seeing is that the issue is fleeting: in the same process sometimes the database will return the error (Data blob not found) and sometimes it will work as if nothing happened.
But what exactly do you do to see it? I thought that it is related to the recovery process, which means you have to insert keys and then terminate your application without closing the database. Then open the database, recovery is performed but the database is corrupt.
Is there a way for me to run your application?
Thanks Christoph
2012/7/10 Artem < reply@reply.github.com
What I'm seeing is that the issue is fleeting: in the same process sometimes the database will return the error (Data blob not found) and sometimes it will work as if nothing happened.
Reply to this email directly or view it on GitHub: https://github.com/cruppstahl/hamsterdb/issues/8#issuecomment-6861392
But what exactly do you do to see it?
Well, to answer this question I've instrumented the program to log all the Hamsterdb calls. Here's what I see:
2012-07-10 09:29:46.63825 [11276.47445114894080] INFO frople.Session - ham_env_open_ex/ham_env_create_ex (HAM_CACHE_UNLIMITED | HAM_ENABLE_TRANSACTIONS | HAM_AUTO_RECOVERY): 0
2012-07-10 09:29:46.63864 [11276.47445114894080] INFO frople.Session - ham_env_open_db/ham_env_create_db (HAM_USE_BTREE; HAM_PARAM_KEYSIZE=16): 0
2012-07-10 09:29:46.64286 [11276.47445114894080] INFO frople.Session - ham_txn_begin@getLocal: 0
2012-07-10 09:29:46.64339 [11276.47445114894080] INFO frople.Session - ham_find (HAM_FIND_GT_MATCH): 0
2012-07-10 09:29:46.64375 [11276.47445114894080] INFO frople.Session - ham_txn_commit@getLocal: 0
2012-07-10 09:29:55.38495 [11276.47445120145152] INFO frople.Session - ham_txn_begin@getLocal: 0
2012-07-10 09:29:55.38559 [11276.47445120145152] INFO frople.Session - ham_find (HAM_FIND_GT_MATCH): 0
2012-07-10 09:29:55.38622 [11276.47445120145152] INFO frople.Session - ham_txn_commit@getLocal: 0
2012-07-10 09:29:57.01055 [11276.47445116995328] INFO frople.Session - ham_txn_begin@getLocal: 0
2012-07-10 09:29:57.01108 [11276.47445116995328] INFO frople.Session - ham_find (HAM_FIND_GT_MATCH): 0
2012-07-10 09:29:57.01159 [11276.47445116995328] INFO frople.Session - ham_txn_commit@getLocal: 0
2012-07-10 11:56:36.72545 [26391.47557329995520] INFO frople.Session - ham_env_open_ex/ham_env_create_ex (HAM_CACHE_UNLIMITED | HAM_ENABLE_TRANSACTIONS | HAM_AUTO_RECOVERY): 0
2012-07-10 11:56:36.72591 [26391.47557329995520] INFO frople.Session - ham_env_open_db/ham_env_create_db (HAM_USE_BTREE; HAM_PARAM_KEYSIZE=16): 0
2012-07-10 11:56:36.73148 [26391.47557329995520] INFO frople.Session - ham_txn_begin@getLocal: 0
2012-07-10 11:56:36.73202 [26391.47557329995520] INFO frople.Session - ham_find@getLocal (HAM_FIND_GT_MATCH): -16
2012-07-10 11:56:36.73250 [26391.47557329995520] INFO frople.Session - ham_txn_abort@getLocal: 0
2012-07-10 11:56:36.73349 [26391.47557329995520] WARN tntnet.worker - http-Error: 500 !ham_find: Data blob not found
2012-07-10 11:57:01.39756 [26391.47557343651584] INFO frople.Session - ham_txn_begin@getLocal: 0
2012-07-10 11:57:01.39793 [26391.47557343651584] INFO frople.Session - ham_find@getLocal (HAM_FIND_GT_MATCH): -16
2012-07-10 11:57:01.39839 [26391.47557343651584] INFO frople.Session - ham_txn_abort@getLocal: 0
2012-07-10 11:57:01.39928 [26391.47557343651584] WARN tntnet.worker - http-Error: 500 !ham_find: Data blob not found
Also on the STDERR I see
6/10 9:29 img2::Conf: tntnet: my pid is 11276
blob_read: blob 33664 not found
blob_read: blob 33664 not found
blob_read: blob 33664 not found
blob_read: blob 33664 not found
blob_read: blob 33664 not found
terminate called without an active exception
So yes, it looks to me like the the program crashed after 9:29, following the mysterious "blob_read" messages, and after the crash the ham_find calls no longer work.
On a side note, this database is kept on a tmpfs filesystem.
Is there a way for me to run your application?
It's a tntnet-based web project with a lot of dependencies. Making an identical installation looks like too much trouble for now.
Hi Artem,
your program starts with opening the Environment and opening the Databases, but how are they created and filled? I think that the database is corrupted during the generation, and not during the sequence that you displayed.
btw - there are two tools in the hamsterdb/tools directory: ham_dump and
ham_info. You can use them for diagnostics, i.e. "ham_info -f
Bye Christoph
2012/7/10 Artem < reply@reply.github.com
But what exactly do you do to see it?
Well, to answer this question I've instrumented the program to log all the Hamsterdb calls. Here's what I see:
2012-07-10 09:29:46.63825 [11276.47445114894080] INFO frople.Session -
ham_env_open_ex/ham_env_create_ex (HAM_CACHE_UNLIMITED | HAM_ENABLE_TRANSACTIONS | HAM_AUTO_RECOVERY): 0 2012-07-10 09:29:46.63864 [11276.47445114894080] INFO frople.Session - ham_env_open_db/ham_env_create_db (HAM_USE_BTREE; HAM_PARAM_KEYSIZE=16): 0 2012-07-10 09:29:46.64286 [11276.47445114894080] INFO frople.Session - ham_txn_begin@getLocal: 0 2012-07-10 09:29:46.64339 [11276.47445114894080] INFO frople.Session - ham_find (HAM_FIND_GT_MATCH): 0 2012-07-10 09:29:46.64375 [11276.47445114894080] INFO frople.Session - ham_txn_commit@getLocal: 0 2012-07-10 09:29:55.38495 [11276.47445120145152] INFO frople.Session - ham_txn_begin@getLocal: 0 2012-07-10 09:29:55.38559 [11276.47445120145152] INFO frople.Session - ham_find (HAM_FIND_GT_MATCH): 0 2012-07-10 09:29:55.38622 [11276.47445120145152] INFO frople.Session - ham_txn_commit@getLocal: 0 2012-07-10 09:29:57.01055 [11276.47445116995328] INFO frople.Session - ham_txn_begin@getLocal: 0 2012-07-10 09:29:57.01108 [11276.47445116995328] INFO frople.Session - ham_find (HAM_FIND_GT_MATCH): 0 2012-07-10 09:29:57.01159 [11276.47445116995328] INFO frople.Session - ham_txn_commit@getLocal: 0 2012-07-10 11:56:36.72545 [26391.47557329995520] INFO frople.Session - ham_env_open_ex/ham_env_create_ex (HAM_CACHE_UNLIMITED | HAM_ENABLE_TRANSACTIONS | HAM_AUTO_RECOVERY): 0 2012-07-10 11:56:36.72591 [26391.47557329995520] INFO frople.Session - ham_env_open_db/ham_env_create_db (HAM_USE_BTREE; HAM_PARAM_KEYSIZE=16): 0 2012-07-10 11:56:36.73148 [26391.47557329995520] INFO frople.Session - ham_txn_begin@getLocal: 0 2012-07-10 11:56:36.73202 [26391.47557329995520] INFO frople.Session - ham_find@getLocal (HAM_FIND_GT_MATCH): -16 2012-07-10 11:56:36.73250 [26391.47557329995520] INFO frople.Session - ham_txn_abort@getLocal: 0 2012-07-10 11:56:36.73349 [26391.47557329995520] WARN tntnet.worker - http-Error: 500 !ham_find: Data blob not found 2012-07-10 11:57:01.39756 [26391.47557343651584] INFO frople.Session - ham_txn_begin@getLocal: 0 2012-07-10 11:57:01.39793 [26391.47557343651584] INFO frople.Session - ham_find@getLocal (HAM_FIND_GT_MATCH): -16 2012-07-10 11:57:01.39839 [26391.47557343651584] INFO frople.Session - ham_txn_abort@getLocal: 0 2012-07-10 11:57:01.39928 [26391.47557343651584] WARN tntnet.worker - http-Error: 500 !ham_find: Data blob not found
Also on the STDERR I see
6/10 9:29 img2::Conf: tntnet: my pid is 11276 blob_read: blob 33664 not found blob_read: blob 33664 not found blob_read: blob 33664 not found blob_read: blob 33664 not found blob_read: blob 33664 not found terminate called without an active exception
So yes, it looks to me like the the program crashed after 9:29, following the mysterious "blob_read" messages, and after the crash the ham_find calls no longer work.
On a side note, this database is kept on a tmpfs filesystem.
Is there a way for me to run your application?
It's a tntnet-based web project with a lot of dependencies. Making an identical installation looks like too much trouble for now.
Reply to this email directly or view it on GitHub: https://github.com/cruppstahl/hamsterdb/issues/8#issuecomment-6870076
I had dangling transactions, that might be the reason. Now I've wrote a RAII wrapper around transaction. Will see if the problem would resurface or not.
I'm currently writing a test program which simulates that. I have a few more questions, then i won't bother you again (i hope) :)
As far as i understand you fill your database, then terminate the program (with dangling transactions). Then you open the file again and it's corrupted.
For me it's critical to understand what you do when you fill the database.
Thanks again for your help, Christoph
2012/7/11 Artem < reply@reply.github.com
I had dangling transactions, that might be the reason. Now I've wrote a RAII wrapper around transaction. Will see if the problem would resurface or not.
Reply to this email directly or view it on GitHub: https://github.com/cruppstahl/hamsterdb/issues/8#issuecomment-6909172
It's one database, all the configuration options are already mentioned in the log sample I've provided. No duplicate keys. Old record is always erased from the cache before inserting a new one, in a single transaction. As I've said, I can show the database-related code. Hope this helps.
12.07.2012 22:47 ÐÏÌØÚÏ×ÁÔÅÌØ "Christoph Rupp" < reply@reply.github.com> ÎÁÐÉÓÁÌ:
I'm currently writing a test program which simulates that. I have a few more questions, then i won't bother you again (i hope) :)
As far as i understand you fill your database, then terminate the program (with dangling transactions). Then you open the file again and it's corrupted.
For me it's critical to understand what you do when you fill the database.
Thanks again for your help, Christoph
2012/7/11 Artem < reply@reply.github.com
I had dangling transactions, that might be the reason. Now I've wrote a RAII wrapper around transaction. Will see if the problem would resurface or not.
Reply to this email directly or view it on GitHub: https://github.com/cruppstahl/hamsterdb/issues/8#issuecomment-6909172
Reply to this email directly or view it on GitHub: https://github.com/cruppstahl/hamsterdb/issues/8#issuecomment-6944045
Sure, if you can share the code it might be helpful.
2012/7/12 Artem < reply@reply.github.com
It's one database, all the configuration options are already mentioned in the log sample I've provided. No duplicate keys. Old record is always erased from the cache before inserting a new one, in a single transaction. As I've said, I can show the database-related code. Hope this helps.
12.07.2012 22:47 ÐÏÌØÚÏ×ÁÔÅÌØ "Christoph Rupp" < reply@reply.github.com> ÎÁÐÉÓÁÌ:
I'm currently writing a test program which simulates that. I have a few more questions, then i won't bother you again (i hope) :)
As far as i understand you fill your database, then terminate the program (with dangling transactions). Then you open the file again and it's corrupted.
For me it's critical to understand what you do when you fill the database.
- do you have multiple databases or just one?
- how are they configured? the keys are extended (36 bytes, i think?) and what's the avg. record size?
- do you use duplicate keys? approx. how much % of your keys are duplicates? can you roughly estimate how many keys you inserted before you saw that failure?
- do you also erase keys? if yes, can you roughly estimate how many you erased before you saw the failure?
Thanks again for your help, Christoph
2012/7/11 Artem < reply@reply.github.com
I had dangling transactions, that might be the reason. Now I've wrote a RAII wrapper around transaction. Will see if the problem would resurface or not.
Reply to this email directly or view it on GitHub: https://github.com/cruppstahl/hamsterdb/issues/8#issuecomment-6909172
Reply to this email directly or view it on GitHub: https://github.com/cruppstahl/hamsterdb/issues/8#issuecomment-6944045
Reply to this email directly or view it on GitHub: https://github.com/cruppstahl/hamsterdb/issues/8#issuecomment-6945384
I've emailed you the code.
I have received it, thank you. I already wrote a test application with the same parameters but so far i am not able to reproduce any problem. But i'll keep on trying.
2012/7/14 Artem < reply@reply.github.com
I've emailed you the code.
Reply to this email directly or view it on GitHub: https://github.com/cruppstahl/hamsterdb/issues/8#issuecomment-6982102
Hi Artem,
i fixed it and i just released 2.0.4.
Thanks for your great and patient help! Christoph
2012/7/15 Christoph Rupp chris@crupp.de
I have received it, thank you. I already wrote a test application with the same parameters but so far i am not able to reproduce any problem. But i'll keep on trying.
2012/7/14 Artem < reply@reply.github.com
I've emailed you the code.
Reply to this email directly or view it on GitHub: https://github.com/cruppstahl/hamsterdb/issues/8#issuecomment-6982102
Glad to hear, Christoph! Thanks!
I'm getting a "Data blob not found" error from the ham_find. Makes no sense to me. Database: http://glim.ru/personal/bugs/FUC.7z