queryverse / ParquetFiles.jl

FileIO.jl integration for Parquet files
Other
19 stars 10 forks source link

use Parquet.jl named tuples iterator: RecordCursor #25

Open tanmaykm opened 4 years ago

tanmaykm commented 4 years ago

This Parquet.jl update will add a named tuple iterator RecordReader. We will be able to use that here directly, instead of wrapping over the older RecCursor.

The new map_logical_types option to ParFile automatically converts byte arrays to strings, so we do not need to handle that here now.

This can be merged after JuliaIO/Parquet.jl#71 is released in Parquet.jl

blairn commented 4 years ago

I think this broke live.

julia> using Queryverse [ Info: Precompiling Queryverse [612083be-0b0f-5412-89c1-4e7c75506a58] [ Info: Installing xlrd via the Conda xlrd package... [ Info: Running conda install -y xlrd in root environment Collecting package metadata (current_repodata.json): done Solving environment: done

Package Plan

environment location: /home/blair/.julia/conda/3

added / updated specs:

The following packages will be downloaded:

package                    |            build
---------------------------|-----------------
xlrd-1.2.0                 |           py37_0         175 KB
------------------------------------------------------------
                                       Total:         175 KB

The following NEW packages will be INSTALLED:

xlrd pkgs/main/linux-64::xlrd-1.2.0-py37_0

Downloading and Extracting Packages xlrd-1.2.0 | 175 KB | ################################################################################################################################################################ | 100% Preparing transaction: done Verifying transaction: done Executing transaction: done ERROR: LoadError: UndefVarError: RecCursor not defined Stacktrace: [1] top-level scope at /home/blair/.julia/packages/ParquetFiles/cLLFb/src/ParquetFiles.jl:26 [2] include(::Module, ::String) at ./Base.jl:377 [3] top-level scope at none:2 [4] eval at ./boot.jl:331 [inlined] [5] eval(::Expr) at ./client.jl:449 [6] top-level scope at ./none:3 in expression starting at /home/blair/.julia/packages/ParquetFiles/cLLFb/src/ParquetFiles.jl:26 ERROR: LoadError: Failed to precompile ParquetFiles [46a55296-af5a-53b0-aaa0-97023b66127f] to /home/blair/.julia/compiled/v1.4/ParquetFiles/WDBBU_eEoIX.ji. Stacktrace: [1] error(::String) at ./error.jl:33 [2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1272 [3] _require(::Base.PkgId) at ./loading.jl:1029 [4] require(::Base.PkgId) at ./loading.jl:927 [5] require(::Module, ::Symbol) at ./loading.jl:922 [6] include(::Module, ::String) at ./Base.jl:377 [7] top-level scope at none:2 [8] eval at ./boot.jl:331 [inlined] [9] eval(::Expr) at ./client.jl:449 [10] top-level scope at ./none:3 in expression starting at /home/blair/.julia/packages/Queryverse/ysqbZ/src/Queryverse.jl:15 ERROR: Failed to precompile Queryverse [612083be-0b0f-5412-89c1-4e7c75506a58] to /home/blair/.julia/compiled/v1.4/Queryverse/hLJnW_eEoIX.ji. Stacktrace: [1] error(::String) at ./error.jl:33 [2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1272 [3] _require(::Base.PkgId) at ./loading.jl:1029 [4] require(::Base.PkgId) at ./loading.jl:927 [5] require(::Module, ::Symbol) at ./loading.jl:922

blairn commented 4 years ago

Rec cursor isn't defined any more, but, it is being referred to.

tanmaykm commented 4 years ago

I think merging this would fix that issue. Parquet.jl has moved to v0.5 where there is no RecCursor, and this PR addresses that.

I feel there should be an upperbound to the compat version range here so that we don't run into this issue again. I shall push a commit with that in a bit.

tanmaykm commented 4 years ago

@davidanthoff does this look good to be merged?

xiaodaigh commented 4 years ago

I'd say merge this with master to fix the conflicts in Project.toml which has proper upper bounding.

xiaodaigh commented 4 years ago

merging this will fix #26 and all the file read issues