Closed gijzelaerr closed 4 years ago
This is what buildbot uses, ie creates a binary value inserts and recreates. Thats what I expect they tested on at least mysql, postgres and sqlite. Worked with my earlier change. But not with the new code, gives builtins.TypeError: string argument without an encoding and File "/usr/lib64/python3.7/site-packages/sqlalchemy/sql/sqltypes.py", line 944, in process value = bytes(value) builtins.TypeError: string argument without an encoding
import sqlalchemy as sa
from buildbot.util import sautils
def test_unicode(migrate_engine): """Test that the database can handle inserting and selecting Unicode"""
submeta = sa.MetaData()
submeta.bind = migrate_engine
test_unicode = sautils.Table(
'test_unicode', submeta,
sa.Column('u', sa.Unicode(length=100)),
sa.Column('b', sa.LargeBinary),
)
test_unicode.create()
# insert a unicode value in there
u = "Frosty the \N{SNOWMAN}"
b = b'\xff\xff\x00'
ins = test_unicode.insert().values(u=u, b=b)
migrate_engine.execute(ins)
# see if the data is intact
row = migrate_engine.execute(sa.select([test_unicode])).fetchall()[0]
assert isinstance(row['u'], str)
assert row['u'] == u
assert isinstance(row['b'], bytes)
assert row['b'] == b
# drop the test table
test_unicode.drop()
I'm not really sure what to do with this copy paste. This (unicode and binary data) is also quite extensively tested in the current test suite. What is actually broken?
Hi @gijzelaerr. I'm taking a look at this and I will come back tomorrow with a more detailed reply, but see my first thoughts below.
I suspect that the problem is the following: In Python 2 (as in C) in order to handle a number of raw bytes you would use an array of chars. In Python 3 this has changed and we need to use the byte
and bytearray
types.
In principle, if I have a Python 3 byte
buffer I should be able to insert it into MonetDB as a BLOB and get it back as a byte
buffer again. I believe that this is where buildbot fails when we try to use it with MonetDB as a backend.
Again, I will take a more detailed look and come back with a more informed opinion (and hopefully some code) tomorrow.
Panos.
ok perfect @kutsurak that would be helpful. Making it properly work with both python2 and python3 is probably painful, so maybe it is just better to make a 1.3.0 release and implement this for a 2.0.0 for which I intend to drop python2 support. Additionally, the python DB API is not really specific about how to deal with bytes data, since it still dates from the py2 era (where strings and bytes were the same thing).
Hi @gijzelaerr. Sorry for me being silent on this. It has been a bit busy the past few days. I hope to spend some time on this issue this week.
One question: do you have a rough schedule for when the 1.3.0 and the 2.0.0 versions are going to be released?
Hi @kutsurak The release schedule is not time-based but issue-based. For now, it looks like this, but I'm open for requests and/or suggestions:
https://github.com/gijzelaerr/pymonetdb/milestones
I think it starts to get time for a new release, so we can also do a release now and push out a 1.4 for the remaining issues.
Hi @gijzelaerr
This is still incomplete but I've made some progress. Take a look at the code and the test results.
I have some ideas for how to fix the failure in Python 2.7 and pypy.
The failure in Python 3.4 would require a different implementation. If we need to keep it I will try to find a way to change the implementation, but this version has reached End Of Life status and I would propose that we drop it.
The failure in Python 3.7 I think is unrelated to my changes.
There are also two tests in the test_unicode.py
file that cause hanging, so they are skipped in these results. I haven't had time to dive into this problem but I will next week.
Panos.
replaced with https://github.com/gijzelaerr/pymonetdb/pull/67
Coverage increased (+0.1%) to 77.351% when pulling ece05dc694050d2c06d8283b601bea741e02fcad on njnes-master into 5d61b2704b1441fe6846cde1df2d6ba312b921bb on master.