Closed fritzgrabo closed 2 years ago
Hey! Great work finding the issue and proposing a fix! I think that you're right this was not intended. I'll take a closer look today to make sure but if all existing tests pass plus your new one then there's a good chance you're right.
Looks good! It's just failing because you don't have access to github actions secrets. Merging. Thank you!
Updated dsq to vendor in this change: https://github.com/multiprocessio/dsq/pull/40.
New version of dsq is out now, https://github.com/multiprocessio/dsq/releases/tag/0.11.0.
@eatonphil That's awesome, thank you! I'll update the affected examples in ob-dsq to reflect the fix later tonight.
PS: Really glad this was such a quick and painless fix in the end. It's also kind of a testament to the quality of the code of dsq and DataStation: I've had no prior exposure to go
whatsoever (read or write) and could find and fix this in a single evening. Yay!
Wow that's awesome you could figure it out without knowing Go. And the shape library is basically the trickiest part of the code so again, great work!
Hi 👋
While working on ob-dsq, I noticed that getting the "Shape" of a JSON file converts shapes of a Scalar kind into a Varied kind when merging Object shapes, even when all sampled rows happen to have the same Scalar kind and type in the merged JSON attribute.
Here's an example to illustrate what I mean:
Note how the
id
column's shape got merged into a Varied kind (TEXT) even though all rows in the input file are numbers.I'm not entirely sure if this was done on purpose to err on the side of safety when working with input files that have rows of varying structure? There's a comment that seems to indicated that, but I can read it both ways :)
In this PR, I propose to try hard to keep a column's shape untouched if all the sampled rows in the input happen to have the same Scalar kind and type.
This could help with unexpected results in queries, too. For example,
"SELECT * FROM {} WHERE id=1"
yields an empty result in the unpatched version because1
doesn't match"1.0"
.Please let me know what you think. Thanks!