meyer-lab / mechanismEncoder

Developing patient-specific phosphoproteomic models using mechanistic autoencoders

implement pretraining pipeline #16

Closed FFroehlich closed 3 years ago

FFroehlich commented 3 years ago

Adds an implementation of pretraining. With this setup, model training with 10 local starts for the full problem can be done in under 1 h with 4 local cores, without using parallelization in all steps.

Will add more documentation over the coming days.
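The "10 local starts" training scheme referred to above can be sketched as generic multi-start local optimization. This is an illustrative toy (Rosenbrock objective, SciPy's L-BFGS-B), not the repository's actual objective or optimizer setup:

```python
import numpy as np
from scipy.optimize import minimize


def objective(x):
    # Rosenbrock function as a stand-in for the model's objective
    return np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2)


def local_start(seed):
    # one local optimization from a random initial point
    rng = np.random.default_rng(seed)
    x0 = rng.uniform(-2.0, 2.0, size=4)
    res = minimize(objective, x0, method="L-BFGS-B")
    return res.fun, res.x


# 10 independent local starts; in practice these could be distributed
# across 4 cores, e.g. with multiprocessing.Pool(4).map(local_start, range(10))
results = [local_start(seed) for seed in range(10)]
best_fun, best_x = min(results, key=lambda r: r[0])
```

Keeping the best of several random starts is what makes the nonconvex fit robust; the per-start runtime is what the core count bounds.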

codecov-io commented 3 years ago

Codecov Report

Merging #16 (8970883) into master (37a2310) will decrease coverage by 12.42%. The diff coverage is 64.63%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master      #16       +/-   ##
===========================================
- Coverage   95.76%   83.33%   -12.43%     
===========================================
  Files           7        9        +2     
  Lines         354      414       +60     
===========================================
+ Hits          339      345        +6     
- Misses         15       69       +54     
Flag      | Coverage Δ
unittests | 83.33% <64.63%> (-12.43%) ↓

Flags with carried forward coverage won't be shown.

Impacted Files                 | Coverage Δ
mEncoder/pretraining.py        | 0.00% <0.00%> (ø)
mEncoder/training.py           | 92.15% <86.66%> (-1.03%) ↓
mEncoder/encoder.py            | 94.28% <91.30%> (+1.42%) ↑
mEncoder/autoencoder.py        | 100.00% <100.00%> (+1.36%) ↑
mEncoder/generate_data.py      | 100.00% <100.00%> (ø)
mEncoder/mechanistic_model.py  | 90.81% <100.00%> (-0.78%) ↓
mEncoder/petab_subproblem.py   | 100.00% <100.00%> (ø)
mEncoder/test/test_encoder.py  | 100.00% <100.00%> (ø)
mEncoder/test/test_model.py    | 100.00% <100.00%> (ø)
... and 2 more
... and 2 more

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Powered by Codecov. Last update 37a2310...8970883.

FFroehlich commented 3 years ago

This fixes multiple bugs that led to an incorrect formulation of the whole problem. For a medium-size model (19 proteins, 22 phospho-sites), a full training run takes 2 h on a desktop machine with 4 cores, without using parallelization in all steps.

data: synthetic__FLT3_MAPK_AKT_STAT.pdf

fit: FLT3_MAPK_AKT_STATsynthetic2fidesfit.pdf
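The pretraining idea this PR implements can be illustrated with a generic two-stage scheme: first fit a reduced subproblem (a subset of parameters with the rest fixed), then use that solution to initialize the full optimization. The toy linear least-squares objective and the particular parameter split below are assumptions for illustration only, not the repository's actual pipeline:

```python
import numpy as np
from scipy.optimize import minimize

# toy data: 20 observations of a 6-parameter linear model, true theta = 1
rng = np.random.default_rng(0)
A = rng.normal(size=(20, 6))
y = A @ np.ones(6) + 0.01 * rng.normal(size=20)


def full_objective(theta):
    # sum-of-squares misfit of the full 6-parameter model
    return np.sum((A @ theta - y) ** 2)


def sub_objective(theta_sub):
    # stage 1 subproblem: optimize the first 3 parameters, rest fixed at 0
    theta = np.concatenate([theta_sub, np.zeros(3)])
    return full_objective(theta)


# Stage 1: pretrain on the reduced subproblem
pre = minimize(sub_objective, np.zeros(3), method="L-BFGS-B")

# Stage 2: solve the full problem, initialized from the pretrained point
x0 = np.concatenate([pre.x, np.zeros(3)])
fit = minimize(full_objective, x0, method="L-BFGS-B")
```

Starting the full problem from a pretrained point typically cuts the number of expensive full-model iterations, which is where the reported speedup would come from.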

aarmey commented 3 years ago

@FFroehlich I've updated the testing on master to use a self-hosted runner, to get around the env changes with Github Actions. Let me know if you have any difficulties once you merge these changes.

FFroehlich commented 3 years ago

> @FFroehlich I've updated the testing on master to use a self-hosted runner, to get around the env changes with Github Actions. Let me know if you have any difficulties once you merge these changes.

Oh sorry, I had already fixed everything necessary to adapt to the new env API on GHA, so that wouldn't have been necessary. With the self-hosted runner, the individual jobs seem to be queued for quite a while.

aarmey commented 3 years ago

Our lab's queue definitely varies from day to day, and is behind today... absolutely feel free to change it back if you'd like.