Closed VarIr closed 4 years ago
This pull request introduces 4 alerts when merging 3c0934d941d290e5fa2d17e7123cd59c36a75df3 into 905ae06d31d0492ba6b7675f4bf463f0295742fa - view on LGTM.com
This pull request introduces 3 alerts when merging 5ca4e1a3432f9a8c672df47feefeef942cb610ce into 905ae06d31d0492ba6b7675f4bf463f0295742fa - view on LGTM.com
Merging #16 into master will increase coverage by 1.17%. The diff coverage is 96.37%.
@@ Coverage Diff @@
## master #16 +/- ##
==========================================
+ Coverage 95.48% 96.66% +1.17%
==========================================
Files 14 33 +19
Lines 576 2036 +1460
==========================================
+ Hits 550 1968 +1418
- Misses 26 68 +42
Impacted Files | Coverage Δ | |
---|---|---|
deepnog/learning/training.py | 85.30% <85.30%> (ø) | |
deepnog/utils/io_utils.py | 96.36% <89.47%> (ø) | |
deepnog/learning/inference.py | 96.00% <93.10%> (ø) | |
deepnog/utils/network.py | 94.00% <94.00%> (ø) | |
deepnog/models/deepfam.py | 95.75% <95.75%> (ø) | |
deepnog/utils/tests/test_utils.py | 96.19% <96.19%> (ø) | |
deepnog/models/deepnog.py | 96.92% <96.92%> (ø) | |
deepnog/client/client.py | 97.54% <97.54%> (ø) | |
deepnog/data/dataset.py | 97.80% <97.80%> (ø) | |
deepnog/client/tests/test_cli.py | 99.38% <99.38%> (ø) | |
... and 47 more | | |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 905ae06...59ef6b7. Read the comment docs.
This pull request introduces 1 alert when merging 3c2883c67a7ef16afa065a65e8fb8aa652f8160f into 905ae06d31d0492ba6b7675f4bf463f0295742fa - view on LGTM.com
This pull request introduces 1 alert when merging 7932833ab163fe736394ecff593de585e9759150 into 905ae06d31d0492ba6b7675f4bf463f0295742fa - view on LGTM.com
Closes #18
This pull request introduces 1 alert when merging f5363593291536d343eeda333cf9ce6ed1e414e6 into 905ae06d31d0492ba6b7675f4bf463f0295742fa - view on LGTM.com
So far, `deepnog` only allows inference with models that we, the developers, trained in separate Jupyter notebooks. This does not scale to a larger number of models for more levels of EggNOG, or to other orthology databases.
This PR introduces components that allow users to train custom models. The primary use case is taking the DeepNOG (=DeepEncoding) architecture and training it on additional levels of EggNOG. Additionally, different architectures can be introduced and trained with reasonable effort.
At the heart of this PR is the new `training.py`, which runs training and validation epochs. The `Dataset` and `DataLoader` classes now support labels, and a `ShuffledDataset` is introduced that still iterates over the FASTA file but can shuffle the input data, subject to a user-defined buffer size. EDIT: An additional `ProteinDataset` class is introduced that features random access. This enables complete shuffling of sequences, ensuring that minibatches differ between epochs, which might affect training. This comes at the cost of first loading the complete FASTA file and keeping it in memory. It might replace the `ShuffledDataset`.
The client now uses two subparsers: `train` and `infer`.
The general package structure was reworked to offer more intuitive modularity.
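To illustrate the trade-off between the two shuffling strategies described above, here is a minimal sketch in plain Python. It is not the code from this PR: `buffered_shuffle` and `full_shuffle` are hypothetical names, and the real `ShuffledDataset` and `ProteinDataset` classes may work differently. The first function streams over its input and shuffles only within a bounded buffer. The second loads everything into memory so that a complete shuffle is possible.

```python
import random
from typing import Iterable, Iterator, List, Optional, TypeVar

T = TypeVar("T")


def buffered_shuffle(items: Iterable[T], buffer_size: int,
                     seed: Optional[int] = None) -> Iterator[T]:
    """Yield items in approximately shuffled order using a bounded buffer.

    Hypothetical helper: memory use is limited to ``buffer_size`` items,
    but an item can only move a limited distance from its input position.
    """
    if buffer_size < 1:
        raise ValueError("buffer_size must be >= 1")
    rng = random.Random(seed)
    buffer: List[T] = []
    for item in items:
        if len(buffer) < buffer_size:
            # Fill the buffer before emitting anything.
            buffer.append(item)
            continue
        # Buffer is full: emit a random element, store the new item in its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: emit the remaining buffered items in random order.
    rng.shuffle(buffer)
    yield from buffer


def full_shuffle(items: Iterable[T], seed: Optional[int] = None) -> List[T]:
    """Load all items into memory and shuffle them completely."""
    records = list(items)
    random.Random(seed).shuffle(records)
    return records
```

With `buffer_size=1` the streaming variant degenerates to the input order. Larger buffers come closer to a complete shuffle at the cost of more memory, which is the same trade-off that motivates the random-access dataset.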
`deepnog` now uses a YAML configuration file, which includes the supported databases (for inference), architectures (for training), and possibly more.
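As a rough sketch of how a configuration could feed the `train` and `infer` subparsers, the snippet below builds an `argparse` parser whose choices come from a configuration mapping. Everything in it is an assumption made for illustration: the `CONFIG` dictionary stands in for the parsed YAML file, and its keys, the database name, the program name, and all argument names are hypothetical and do not reflect the actual `deepnog` command-line interface.

```python
import argparse

# Hypothetical stand-in for the parsed YAML configuration.
# Keys and values are illustrative only, not the real deepnog config.
CONFIG = {
    "database": ["eggNOG5"],                      # assumed database name
    "architecture": ["deepencoding", "deepfam"],  # assumed architecture names
}


def build_parser(config: dict) -> argparse.ArgumentParser:
    """Build a parser with 'train' and 'infer' subcommands from a config mapping."""
    parser = argparse.ArgumentParser(prog="deepnog-sketch")
    subparsers = parser.add_subparsers(dest="phase", required=True)

    # Inference: supported databases are taken from the configuration.
    infer = subparsers.add_parser("infer", help="predict with a trained model")
    infer.add_argument("file", help="input FASTA file")
    infer.add_argument("--database", choices=config["database"],
                       default=config["database"][0])

    # Training: available architectures are taken from the configuration.
    train = subparsers.add_parser("train", help="train a custom model")
    train.add_argument("training_file", help="FASTA file with training sequences")
    train.add_argument("--architecture", choices=config["architecture"],
                       default=config["architecture"][0])
    train.add_argument("--n-epochs", type=int, default=15)
    return parser
```

Under these assumptions, `build_parser(CONFIG).parse_args(["train", "train.fa", "--architecture", "deepfam"])` yields a namespace with `phase == "train"`. Adding a new architecture would then only require a configuration change, with no edit to the parser code.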