Open pokidovea opened 5 years ago
I agree that it's a good idea to use a python lib to read in parquet files. Editing parquet files might be a bit inefficient using any text editor.
@yuj would you be interested in taking a PR to accomplish this? I tried to do this separately as a fork (https://github.com/dogversioning/sublime-parquet-python), which changes the rendering options (the python tools I used as a first pass don't support JSON output), but the sublime text folks have a light preference to consolidate these approaches if possible.
@dogversioning PRs are always welcome! Please send it over.
Eventually I guess we all still prefer @pokidovea's suggestion to use pyarrow to read parquet files, instead of using parquet-tools. Anyone interested in accomplishing that too? :)
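For reference, a rough sketch of what the pyarrow route could look like: read the file into rows, then render each row as one JSON object per line so the output stays close to what a JSON-based view would show. The helper names (`read_parquet_rows`, `rows_to_json_lines`) and the `limit` parameter are made up for illustration and are not part of the plugin. The sketch assumes a pyarrow version recent enough to provide `Table.to_pylist()`.

```python
import json


def read_parquet_rows(path, limit=None):
    """Read a parquet file into a list of dicts, one dict per row.

    Hypothetical helper; `limit` optionally caps the number of rows returned.
    """
    # Imported lazily so the JSON helper below works without pyarrow installed.
    import pyarrow.parquet as pq

    table = pq.read_table(path)
    if limit is not None:
        # slice(offset, length) keeps only the first `limit` rows.
        table = table.slice(0, limit)
    return table.to_pylist()


def rows_to_json_lines(rows):
    """Render rows as newline-delimited JSON, one object per line."""
    # default=str is a fallback for values json cannot encode natively
    # (dates, decimals, bytes); it is a lossy choice made for display only.
    return "\n".join(json.dumps(row, default=str) for row in rows)
```

In a plugin, the text shown in the view would then be something like `rows_to_json_lines(read_parquet_rows(path, limit=1000))`, with the row cap being an arbitrary choice to keep large files responsive.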
@yuj yeah, i think it makes sense - this was more of an incremental approach to solve an acute issue, but something like that was next on my list of things to potentially tackle.
Anyway, give me a bit to reconcile the fork approach with an in-place one and i'll open a PR.
@yuj So I spent a little time this morning looking into this - there's some tradeoffs:
Lib folder.
If the first one doesn't bother you and you're ok with the hoops on the latter (I think for something of this scope the pre-built route isn't worth the effort), then it :could: be done. But it's an open question whether this makes the barrier to entry too high.
Installing the Java-based parquet-tools is not convenient. There is at least one Python lib for working with parquet: pyarrow. There are some advantages to using this lib:
What do you think?