byu-dml / d3m-dynamic-neural-architecture


Ensure training set has all primitives #206

Closed epeters3 closed 4 years ago

epeters3 commented 4 years ago

This PR ensures that the training set includes every primitive found in the full dataset, while keeping the train/test split otherwise as random as possible. This prevents errors caused by models encountering unseen primitives at test time.
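The idea can be illustrated with a minimal sketch. This is not the code from this PR: the function name `split_covering_primitives`, the `"primitives"` key, and the list-of-dicts layout are all hypothetical, chosen only to show one way a split can stay random while guaranteeing that every primitive appears in the training set.

```python
import random


def split_covering_primitives(pipelines, test_ratio, seed=0):
    """Randomly split `pipelines` so the train set contains every primitive.

    Assumption (hypothetical layout): each pipeline is a dict with a
    "primitives" entry listing the primitive names it uses.
    """
    rng = random.Random(seed)
    shuffled = list(pipelines)
    rng.shuffle(shuffled)

    covered = set()
    train, remainder = [], []
    # First pass: any pipeline that contributes a not-yet-seen primitive is
    # forced into the train set. Because the order is shuffled, which
    # pipelines get forced is itself random.
    for pipeline in shuffled:
        primitives = set(pipeline["primitives"])
        if primitives - covered:
            train.append(pipeline)
            covered |= primitives
        else:
            remainder.append(pipeline)

    # Second pass: the remainder is already in random order, so take the
    # test set from its front and send everything else to train.
    n_test = min(len(remainder), round(test_ratio * len(shuffled)))
    test = remainder[:n_test]
    train.extend(remainder[n_test:])
    return train, test
```

Note that if many pipelines are needed to cover all primitives, the test set in this sketch can end up smaller than `test_ratio` requests, since coverage takes priority over the exact ratio.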

This PR also ensures that when using the autosklearn metamodel, all datasets and their metafeatures are preserved in the metadataset.

codecov-commenter commented 4 years ago

Codecov Report

Merging #206 into develop will increase coverage by 0.48%. The diff coverage is 82.08%.


@@             Coverage Diff             @@
##           develop     #206      +/-   ##
===========================================
+ Coverage    55.12%   55.60%   +0.48%     
===========================================
  Files           36       36              
  Lines         2712     2764      +52     
===========================================
+ Hits          1495     1537      +42     
- Misses        1217     1227      +10     
| Impacted Files | Coverage Δ |
| --- | --- |
| `dna/models/baselines.py` | 33.51% <0.00%> (-0.19%) :arrow_down: |
| `dna/data.py` | 72.78% <75.67%> (-0.54%) :arrow_down: |
| `dna/__main__.py` | 24.39% <100.00%> (ø) |
| `dna/models/base_models.py` | 46.40% <100.00%> (ø) |
| `dna/utils.py` | 88.40% <100.00%> (+2.19%) :arrow_up: |
| `test/test_utils.py` | 100.00% <100.00%> (ø) |


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 0046b04...abe5688.