Traceback (most recent call last):
  File "/usr/lib64/python2.7/pdb.py", line 1314, in main
    pdb._runscript(mainpyfile)
  File "/usr/lib64/python2.7/pdb.py", line 1233, in _runscript
    self.run(statement)
  File "/usr/lib64/python2.7/bdb.py", line 400, in run
    exec cmd in globals, locals
  File "<string>", line 1, in <module>
  File "transformParquet.py", line 1, in <module>
    import parquet
  File "/home/ec2-user/poll-pull-transform-parquet/pptp/local/lib/python2.7/site-packages/parquet/__init__.py", line 379, in DictReader
    for row in reader(fo, columns):
  File "/home/ec2-user/poll-pull-transform-parquet/pptp/local/lib/python2.7/site-packages/parquet/__init__.py", line 433, in reader
    dict_items = read_dictionary_page(fo, ph, cmd)
  File "/home/ec2-user/poll-pull-transform-parquet/pptp/local/lib/python2.7/site-packages/parquet/__init__.py", line 359, in read_dictionary_page
    page_header.dictionary_page_header.num_values)
  File "/home/ec2-user/poll-pull-transform-parquet/pptp/local/lib/python2.7/site-packages/parquet/encoding.py", line 88, in read_plain
    return conv(fo, count)
  File "/home/ec2-user/poll-pull-transform-parquet/pptp/local/lib/python2.7/site-packages/parquet/encoding.py", line 46, in read_plain_int96
    items = struct.unpack("<qi" * count, fo.read(12) * count)
error: bad char in struct format
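The format expression
"<qi" * count
produces something like <qi<qi<qi when count is 3.
The docs for the struct module indicate that only the first character of the format string can be used to indicate the byte order, size and alignment, so the repeated "<" characters are presumably what triggers the "bad char" error.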
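For reference, here is a minimal sketch of what I think the call was meant to do. The function name read_plain_int96_sketch is made up for illustration and is not part of the parquet package. It puts the "<" prefix in the format string once and reads 12 * count bytes in a single call; the existing line reads 12 bytes and then repeats those same bytes count times, which looks like a second problem. The interpretation of each value is an assumption on my part: as far as I know, INT96 columns written by Impala and Hive hold 8 bytes of nanoseconds within the day followed by a 4-byte Julian day number, both little-endian, which is why the sketch returns the two fields as a pair instead of merging them into one integer.

```python
import io
import struct


def read_plain_int96_sketch(fo, count):
    """Read `count` 12-byte values from the file object `fo`.

    Hypothetical replacement for read_plain_int96; returns a list of
    (first 8 bytes as int64, last 4 bytes as int32) tuples.
    """
    data = fo.read(12 * count)
    if len(data) != 12 * count:
        raise ValueError("expected %d bytes, got %d" % (12 * count, len(data)))
    # The byte-order prefix may appear only once, at the start of the format.
    fields = struct.unpack("<" + "qi" * count, data)
    # Even positions hold the 8-byte fields, odd positions the 4-byte fields.
    return list(zip(fields[0::2], fields[1::2]))


# Two hand-built values. Assumed layout: (nanoseconds of day, Julian day).
# Julian day 2440588 corresponds to 1970-01-01.
sample = struct.pack("<qi", 1, 2440588) + struct.pack("<qi", 2, 2440589)
values = read_plain_int96_sketch(io.BytesIO(sample), 2)
```

Building the sample with struct.pack gives a round-trip check that does not depend on a real Parquet file, although it only confirms that the unpacking is self-consistent, not that the timestamp interpretation is right.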
I've tested potential fixes for this but I suspect the results may be incorrect because the values aren't what I expected. I tried calling an old version of read_plain_int96 multiple times, but that didn't produce values I expected either.
Does anybody have a good test case for this or know what the format ought to be?