Closed eliran-lightricks closed 5 years ago
Just a heads up, Redshift considers REAL
to be FLOAT4
(or float32()
in SQLAlchemy). I think if you use float64()
you will have issues reading the data with Spectrum.
@eliran-lightricks from reading the documentation I think @andrewgross is correct, REAL columns should use type pa.float32()
.
Relevant documentation below -- note that REALs consume 4 bytes of space, indicating pa.float32()
is sufficient.
https://docs.aws.amazon.com/redshift/latest/dg/r_Numeric_types201.html
Would you mind testing with pa.float32()
and see how it goes?
I did some tests and it seems to work with uploading to S3 and copying back. But for some reason the tests are failing, trying to figure out exactly where.
@eliran-lightricks try using self.assertAlmostEqual(self.data, written_data, places=5)
I think it's just the floating point precision causing issues with strict comparison
@c-nichols Had to do some alterations since self.assertAlmostEqual
can't work directly on lists
. Let me know if that's okay with you.
@c-nichols Did you release this on PyPI? I pip install
'd this today, and this error cropped up -- got fixed when I applied the diff from the PR.
Releasing is still on my todo list :see_no_evil: I will do so within the next couple days
Please do! Don't make me monkeypatch this on production 😂
@anomitra new release is up on pypi. "a few" == 18 apparently
Trying to solve #37