byu-dml / d3m-experimenter

A distributed system for creating, running, and persisting many machine learning experiments.

Use BYU Fork of Dsbox Encoder #75

Closed epeters3 closed 5 years ago

epeters3 commented 5 years ago

Using a fork of d3m.primitives.data_preprocessing.encoder.DSBOX (unit tests included in this PR) to give us these benefits:

codecov-io commented 5 years ago

Codecov Report

Merging #75 into develop will increase coverage by 2.29%. The diff coverage is 82.35%.

@@             Coverage Diff             @@
##           develop      #75      +/-   ##
===========================================
+ Coverage    65.56%   67.85%   +2.29%     
===========================================
  Files           28       31       +3     
  Lines         1661     1910     +249     
===========================================
+ Hits          1089     1296     +207     
- Misses         572      614      +42
| Impacted Files | Coverage Δ | |
|---|---|---|
| test/test_pipeline_builder.py | 97.97% <ø> (ø) | :arrow_up: |
| experimenter/pipeline_builder.py | 93.87% <ø> (-0.28%) | :arrow_down: |
| test/config.py | 100% <ø> (ø) | :arrow_up: |
| test/test_pipeline_generator.py | 100% <100%> (ø) | :arrow_up: |
| experimenter/constants.py | 100% <100%> (ø) | :arrow_up: |
| experimenter/config.py | 100% <100%> (ø) | :arrow_up: |
| experimenter/primitives/dsbox_encoder.py | 78.44% <78.44%> (ø) | |
| test/test_dsbox_encoder.py | 90.32% <90.32%> (ø) | |
... and 1 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 73cd314...895a5c2.

epeters3 commented 5 years ago

Hey @orionw, would you be willing to review this PR? If it looks good, go ahead and merge it, and try including the code in your efforts to run the experiments. This PR should fix the errors we were getting yesterday: the encoders sometimes ran into unseen values in the test data sets, and they failed when a dataset had no categorical attributes.
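
For anyone reading along, here is a minimal sketch of the two failure modes mentioned above and one way an encoder can tolerate them. It is not the code from this PR or from the DSBOX primitive: the class name `SafeOneHotEncoder`, the list-of-dicts row format, the "all values are strings" rule for detecting categorical columns, and the `<unseen>` indicator column are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence

Row = Dict[str, object]


class SafeOneHotEncoder:
    """Illustrative one-hot encoder (hypothetical, not the DSBOX fork)."""

    def __init__(self) -> None:
        # Maps each categorical column name to its sorted training categories.
        self.categories: Dict[str, List[str]] = {}

    def fit(self, rows: Sequence[Row]) -> "SafeOneHotEncoder":
        self.categories = {}
        columns = list(rows[0].keys()) if rows else []
        for col in columns:
            values = [row[col] for row in rows]
            # Assumption: a column is categorical only if every value is a string.
            if all(isinstance(v, str) for v in values):
                self.categories[col] = sorted(set(values))
        return self

    def transform(self, rows: Sequence[Row]) -> List[Row]:
        # No categorical attributes: return copies of the rows unchanged
        # instead of failing.
        if not self.categories:
            return [dict(row) for row in rows]
        encoded: List[Row] = []
        for row in rows:
            out: Row = {}
            for col, value in row.items():
                if col not in self.categories:
                    out[col] = value
                    continue
                known = self.categories[col]
                for cat in known:
                    out[f"{col}={cat}"] = int(value == cat)
                # Values never seen during fit set a dedicated indicator
                # rather than raising an error.
                out[f"{col}=<unseen>"] = int(value not in known)
            encoded.append(out)
        return encoded


train = [{"color": "red", "size": 1.0}, {"color": "blue", "size": 2.0}]
test = [{"color": "green", "size": 3.0}]  # "green" does not appear in train

encoder = SafeOneHotEncoder().fit(train)
print(encoder.transform(test))
```

In this sketch the unseen value "green" zeroes the known category columns and sets `color=<unseen>` to 1, and a dataset with only numeric columns passes through `transform` untouched. Whether the fork handles these cases the same way is not something this thread states.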