baskard / thrudb

Automatically exported from code.google.com/p/thrudb

inserting/fetching problems with mysql_glue.h #12

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
1. Configure thrudoc to use the mysql backend.
2. Hack the serialize() method to print out the size of the resulting serialized object.
3. Hack the deserialize() method to print out the size of the object to be deserialized.
4. Use the BookmarkExample.py script to insert a bookmark.
5. When the serialize() method kicks off, you'll see that the size is much larger than MYSQL_BACKEND_MAX_STRING_SIZE, e.g. 200 bytes.
6. When the deserialize() method kicks off for the corresponding bookmark, you'll see that the size is only about 64 bytes. Where did the other bytes go? Of course, the deserialization ends up failing due to a corrupted data stream.
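
For reference, here is a minimal sketch of the kind of instrumentation meant in steps 2 and 3, assuming the standard Thrift Python helpers and a generated Bookmark struct (the exact module path and helper names in BookmarkExample.py may differ):

    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol
    # 'bookmark.ttypes.Bookmark' stands in for the generated struct used by BookmarkExample.py
    from bookmark.ttypes import Bookmark

    def serialize(bookmark):
        transport = TTransport.TMemoryBuffer()
        protocol = TBinaryProtocol.TBinaryProtocol(transport)
        bookmark.write(protocol)
        data = transport.getvalue()
        print("serialized size: %d bytes" % len(data))      # step 2: e.g. ~200 bytes
        return data

    def deserialize(data):
        print("size to deserialize: %d bytes" % len(data))  # step 3: only ~64 bytes come back
        transport = TTransport.TMemoryBuffer(data)
        protocol = TBinaryProtocol.TBinaryProtocol(transport)
        bookmark = Bookmark()
        bookmark.read(protocol)
        return bookmark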

Set MYSQL_BACKEND_MAX_STRING_SIZE to something much larger (it's going into a BLOB column after all), e.g. 4096.

Try the above steps again. We think it should pass... oh no, it doesn't!

Take a close look at the serialized() value; chances are it contains embedded null characters (at least it does for my build and revisions).
It looks like the helper string class chokes when it hits the first null character.

So the problem is twofold: one, the default max string size is too low - it should probably be much higher, or better yet, configurable. Two, the backend needs to be able to handle embedded null characters.

Workaround (at least for me):
1) In BookmarkExample.py, I hacked up the resulting serialized() object by replacing the null character '\x00' with some other token.
2) Upon deserialization, I replaced the token with the '\x00' character so that TBinaryProtocol can properly decode it.
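
A minimal sketch of that workaround, with a hypothetical token and helper names of my own (the token just has to be something that can't otherwise appear in the serialized payload):

    # Hypothetical helpers; NULL_TOKEN must not otherwise occur in the payload.
    NULL_TOKEN = "<<NUL>>"

    def escape_nulls(data):
        # applied to the output of serialize() before handing the value to thrudoc
        return data.replace("\x00", NULL_TOKEN)

    def unescape_nulls(data):
        # applied to the value fetched back from thrudoc, before TBinaryProtocol decodes it
        return data.replace(NULL_TOKEN, "\x00")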

Another workaround: do the hack at a lower level (inside TBinaryProtocol), or don't use TBinaryProtocol at all (use cPickle, for example).
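
A sketch of the cPickle variant, assuming the generated Bookmark object is a plain Python class whose fields live in __dict__ (pickle's default text protocol is ASCII-based, so ordinary text fields produce no NUL bytes):

    import cPickle

    def serialize(bookmark):
        # protocol 0 (the default) is a text protocol, so typical output has no NUL bytes
        return cPickle.dumps(bookmark.__dict__, 0)

    def deserialize(data, bookmark_cls):
        bookmark = bookmark_cls()
        bookmark.__dict__.update(cPickle.loads(data))
        return bookmark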

I don't like any of these solutions, as they muck with someone else's protocol :)

BTW, I am using the trunk head revisions of thrudb.
BTW, great product and concept, can't wait to see the codebase mature!

thanks,
Fi
coderfi@gmail.com

Original issue reported on code.google.com by fairizaz...@gmail.com on 25 Apr 2009 at 6:53

GoogleCodeExporter commented 8 years ago
Thanks for the details. The code wasn't handling binary data correctly. I've made a fix - could you verify it?

Thanks!

Original comment by jake%3.r...@gtempaccount.com on 26 Apr 2009 at 1:19

GoogleCodeExporter commented 8 years ago
Wow, that was quick, thanks!

I see you made the max string size a healthy 8096 (kB); that should work fine for now (though I think you meant 8192).

I updated, recompiled, and reran my import script without the workaround. 

Unfortunately, it does not seem to work.
It still truncates the data when it encounters a null character.

Original comment by fairizaz...@gmail.com on 26 Apr 2009 at 2:10

GoogleCodeExporter commented 8 years ago
ok, let me install mysql and try to replicate this.  i'll keep you posted...

Original comment by jake%3.r...@gtempaccount.com on 26 Apr 2009 at 2:26

GoogleCodeExporter commented 8 years ago
Ok, thanks for looking into it.

FYI, I noticed in mysql_glue.h you have the statements:

    this->str1_length = strlen (str1);
    this->str2_length = strlen (str2);

I'm 80% sure this is one of the problems, since strlen(str1) will return the wrong length whenever the string contains an embedded null character (it stops counting at the first NUL).
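
For what it's worth, the truncation at the first NUL is easy to demonstrate from Python by calling libc's strlen directly (just an illustration of the point, not thrudb code; works on most Unix systems):

    import ctypes, ctypes.util

    libc = ctypes.CDLL(ctypes.util.find_library("c"))

    data = b"abc\x00defgh"        # a payload with an embedded NUL, like TBinaryProtocol output
    print(len(data))              # 9 -- the real length
    print(libc.strlen(data))      # 3 -- strlen stops at the first NUL; the rest is lost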

Original comment by fairizaz...@gmail.com on 27 Apr 2009 at 10:00

GoogleCodeExporter commented 8 years ago
thanks for the info.  i still haven't found the time to replicate the issue, but in the meantime I've removed the strlen dependency.  let me know if this helps...

Original comment by jake%3.r...@gtempaccount.com on 28 Apr 2009 at 1:10