czbiohub-sf / orpheum

Orpheum (previously called, and published under, the name sencha) is a Python package for directly translating RNA-seq reads into coding protein sequences.
MIT License

Convert pandas/apache parquet to vaex-compatible format #25

Closed olgabot closed 4 years ago

olgabot commented 4 years ago

Read in a pandas/Apache Parquet file and write out a vaex-compatible HDF5 file. The HDF5 file opens lazily, which halves the read and write time.
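A minimal sketch of what such a conversion could look like, assuming `pandas` (with a Parquet engine such as `pyarrow`) and `vaex` are installed. The function names `hdf5_path_for` and `parquet_to_vaex_hdf5` are hypothetical and used only for illustration; this is not the code in `khtools/pandas2vaex.py`.

```python
from pathlib import Path


def hdf5_path_for(parquet_path):
    """Derive an output path by swapping the file suffix for .hdf5 (hypothetical helper)."""
    return Path(parquet_path).with_suffix(".hdf5")


def parquet_to_vaex_hdf5(parquet_path, hdf5_path=None):
    """Read a Parquet file with pandas and export it as HDF5 that vaex can open."""
    # Third-party dependencies are imported inside the function so the
    # path helper above stays usable without them.
    import pandas as pd
    import vaex

    out = Path(hdf5_path) if hdf5_path is not None else hdf5_path_for(parquet_path)

    # pandas reads the Parquet file eagerly: every row is loaded into memory.
    pandas_df = pd.read_parquet(parquet_path)

    # Wrap the pandas columns in a vaex DataFrame, dropping the pandas index.
    vaex_df = vaex.from_pandas(pandas_df, copy_index=False)

    # Write the columns out in vaex's HDF5 layout.
    vaex_df.export_hdf5(str(out))
    return out
```

After the conversion, `vaex.open(path)` should memory-map the HDF5 file rather than load it up front, which is where the lazy-opening benefit described above would come from.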

codecov-io commented 4 years ago

Codecov Report

Merging #25 into master will increase coverage by 2.11%. The diff coverage is 48.78%.


@@            Coverage Diff             @@
##           master      #25      +/-   ##
==========================================
+ Coverage   45.83%   47.95%   +2.11%     
==========================================
  Files          20       21       +1     
  Lines        1213     1345     +132     
==========================================
+ Hits          556      645      +89     
- Misses        657      700      +43

| Impacted Files | Coverage Δ |
|---|---|
| khtools/commandline.py | 86.66% <100%> (+2.05%) :arrow_up: |
| khtools/pandas2vaex.py | 46.15% <46.15%> (ø) |
| khtools/compare_kmer_content.py | 60.73% <0%> (-0.02%) :arrow_down: |
| khtools/sequence_encodings.py | 98.82% <0%> (+0.82%) :arrow_up: |
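
The project-level percentages in the coverage diff follow directly from the Hits and Lines rows. A small sketch of that arithmetic, using only the numbers from the report (the helper name `coverage_percent` is made up for illustration):

```python
def coverage_percent(hits, lines):
    """Coverage as a percentage: covered lines divided by total lines."""
    return 100.0 * hits / lines


# Hits and Lines taken from the coverage diff in this report.
master = coverage_percent(556, 1213)
pr = coverage_percent(645, 1345)

# Hits plus Misses should add up to Lines on both sides.
assert 556 + 657 == 1213
assert 645 + 700 == 1345

print(f"master: {master:.3f}%  #25: {pr:.3f}%  change: {pr - master:+.3f}%")
```

The computed values sit slightly above the reported 45.83%, 47.95% and +2.11%, which suggests the report truncates to two decimals rather than rounding.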


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 6075dee...e261d9e.

olgabot commented 4 years ago

Not doing this now - low priority