jmboehm / Douglass.jl

Stata-like toolkit for data wrangling on Julia DataFrames

Performance #11

Closed jmboehm closed 4 years ago

jmboehm commented 4 years ago

Many of the macros generate code that is not type-stable and is therefore horrendously slow. This affects in particular

codecov[bot] commented 4 years ago

Codecov Report

Merging #11 into master will increase coverage by 1.05%. The diff coverage is 92.96%.

@@            Coverage Diff             @@
##           master      #11      +/-   ##
==========================================
+ Coverage   76.64%   77.69%   +1.05%     
==========================================
  Files          17       18       +1     
  Lines         608      677      +69     
==========================================
+ Hits          466      526      +60     
- Misses        142      151       +9     
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/Douglass.jl | 100.00% <ø> (ø) | |
| src/commands/duplicates.jl | 100.00% <ø> (ø) | |
| src/helper.jl | 88.23% <86.20%> (-7.22%) | :arrow_down: |
| src/commands/erep.jl | 93.33% <93.33%> (ø) | |
| src/commands/egen.jl | 95.91% <96.87%> (-1.27%) | :arrow_down: |
| src/commands/drop.jl | 100.00% <100.00%> (ø) | |
| src/commands/generate.jl | 91.42% <100.00%> (ø) | |
| src/commands/keep.jl | 100.00% <100.00%> (ø) | |
| src/commands/replace.jl | 91.89% <100.00%> (ø) | |
| src/commands/reshape.jl | 65.00% <100.00%> (+0.89%) | :arrow_up: |

... and 4 more

jmboehm commented 4 years ago

Merging this sooner rather than later. reshape_long is still painfully slow, which I think is because DataFrames.stack is slow (probably not type-stable).

Some bug fixes too.

Also, this changes the version to 0.0.1 in preparation for registration.
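
For readers unfamiliar with the type-stability problem discussed in this thread, here is a minimal sketch of the general pattern. It does not reproduce the code generated by the Douglass.jl macros. The container type and the function names `sum_unstable`, `sum_kernel`, and `sum_barrier` are hypothetical, chosen only for illustration. A `Dict{Symbol,AbstractVector}` stands in for a table whose column types the compiler cannot see. Looping over such a column directly forces dynamic dispatch on every iteration, while passing the column to a separate function (a "function barrier") lets Julia compile a specialized loop.

```julia
# Hypothetical stand-in for a table whose column element types are
# not visible to the compiler (not the actual DataFrames.jl type).
t = Dict{Symbol,AbstractVector}(:a => collect(1.0:1000.0))

# Type-unstable version: `x` is inferred only as `AbstractVector`,
# so the element type inside the loop is unknown at compile time.
function sum_unstable(t::Dict{Symbol,AbstractVector}, col::Symbol)
    x = t[col]
    s = 0.0
    for v in x
        s += v
    end
    return s
end

# Kernel: compiled separately for each concrete column type it receives.
function sum_kernel(x::AbstractVector{T}) where {T<:Real}
    s = zero(float(T))
    for v in x
        s += v
    end
    return s
end

# Function barrier: one dynamic dispatch at the call, then a type-stable loop.
sum_barrier(t::Dict{Symbol,AbstractVector}, col::Symbol) = sum_kernel(t[col])

println(sum_unstable(t, :a) == sum_barrier(t, :a))
```

Both versions return the same value; only the generated code differs. Running `@code_warntype sum_unstable(t, :a)` (from `InteractiveUtils`, loaded by default in the REPL) should flag the abstractly typed variables, which is one way to check whether generated code has this problem.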