mli opened 4 years ago
@astonzhang @smolix @AnirudhDagar @archersama
Thanks. The idea is very good, and the workaround of breaking a class definition into multiple cells may address our earlier concerns. One thing we need to think about carefully is the design of the `Trainer` base class, such as its `fit` method: D2L has many different use cases, such as sequence-to-sequence and object detection.
As @astonzhang suggested, I have the same concerns about the design of the `Trainer` base class's `fit` method, which needs to handle various scenarios.
Other than that, type hints will surely be a great addition for type checking with mypy, and they also improve the readability of the code. We should definitely add them to the key APIs.
Also, f-strings in Python 3 help make print statements more readable and concise, so there is no reason not to go ahead with refactoring these.
The idea of keeping all the framework implementations in a single library looks good, but it will make maintenance a bit more complex, since a single package then carries multiple framework dependencies. This is a design choice we should discuss in detail.
Yes, the API redesign will definitely suit the mini framework-agnostic library that d2l can become. But we do not want to over-design things and end up with complex APIs. That could also hinder the understanding of DL concepts, along with the code, for people who are just starting out and are new to the world of deep learning.
We can have multiple versions of `Trainer`, for example the basic CPU trainer `BasicTrainer`, the multi-GPU trainer `Trainer`, and others. The idea is to reuse code.
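A minimal sketch of what that reuse could look like, assuming simple inheritance; `BasicTrainer`, the method names, and the device handling here are illustrative, not the actual d2l API:

```python
# Sketch: a basic single-device trainer, and a multi-GPU variant that
# reuses its training loop and only overrides device setup.
class BasicTrainer:
    def __init__(self):
        self.devices = ["cpu"]

    def fit(self, model, data):
        # Placeholder loop standing in for forward/backward/update.
        return [f"step on {self.devices[0]}" for _ in data]

class Trainer(BasicTrainer):
    """Multi-GPU trainer reusing BasicTrainer's fit loop."""
    def __init__(self, num_gpus=2):
        super().__init__()
        self.devices = [f"gpu{i}" for i in range(num_gpus)]

print(Trainer(num_gpus=2).fit(None, data=[0, 1]))
```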
We don't need a single trainer to support every framework, but we should make them as similar as possible. I'm thinking about having all the code in the `d2l` module, then:

```python
from d2l import mxnet as d2l       # for mxnet implementations
from d2l import torch as d2l       # for pytorch implementations
from d2l import tensorflow as d2l  # for tensorflow implementations
```
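To illustrate how this aliasing pattern works, here is a self-contained sketch that fakes the package with in-memory modules (the real library would simply ship `d2l/mxnet.py`, `d2l/torch.py`, etc., each exporting the same names, so notebook code stays framework-agnostic):

```python
import sys
import types

# Hypothetical sketch: build a fake "d2l" package whose backend
# submodules expose the same names (here, Trainer).
d2l_pkg = types.ModuleType("d2l")
torch_backend = types.ModuleType("d2l.torch")

class TorchTrainer:
    backend = "pytorch"

torch_backend.Trainer = TorchTrainer
d2l_pkg.torch = torch_backend
sys.modules["d2l"] = d2l_pkg
sys.modules["d2l.torch"] = torch_backend

# A user picks one backend; the rest of the code is unchanged.
from d2l import torch as d2l
print(d2l.Trainer.backend)  # pytorch
```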
Background
Most d2l chapters follow the data/network/loss/training/prediction structure. Code is reused by saving it into the d2l package. However, the APIs are not consistent, and some functions take a lot of arguments. Some users are using the d2l package in their products after learning the material, so it would be good for d2l to become a mini deep learning framework, and to improve its readability as well.
Design Principles
API design
Inspired by pytorch-lightning, we can create a `Trainer` class.
Now implement a particular algorithm
Plug in a particular network
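As a rough sketch of this idea (the method names and `max_epochs` parameter are assumptions for illustration, not the final d2l API): the `Trainer` owns the loop, the algorithm lives in a `Module` subclass, and a particular network is plugged in by instantiating that subclass.

```python
# Hedged sketch of a pytorch-lightning-style split between the training
# loop (Trainer) and the algorithm (Module subclass).
class Module:
    def training_step(self, batch):
        raise NotImplementedError

class Trainer:
    def __init__(self, max_epochs=1):
        self.max_epochs = max_epochs

    def fit(self, model, data):
        for _ in range(self.max_epochs):
            for batch in data:
                model.training_step(batch)

class LinearRegression(Module):
    def __init__(self):
        self.steps = 0

    def training_step(self, batch):
        self.steps += 1  # placeholder for forward/backward/update

model = LinearRegression()
Trainer(max_epochs=2).fit(model, data=[1, 2, 3])
print(model.steps)  # 6
```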
Expand the class definition into multiple code cells
One main reason we haven't used classes much is that a class implementation is long, and we want to break it into multiple code cells in Jupyter.
Recently I found a way to do it:
Cell 1:
Cell 2:
Cell 3:
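The cell contents above are missing from this copy of the issue. One way to split a class across cells, sketched here with a hypothetical `add_to_class` helper, is to register functions defined in later cells as methods on an already-defined class:

```python
# Hypothetical sketch: a decorator that attaches a function defined in a
# later notebook cell as a method of an existing class.
def add_to_class(cls):
    def wrapper(fn):
        setattr(cls, fn.__name__, fn)
        return fn
    return wrapper

# Cell 1: define the class skeleton.
class Trainer:
    def __init__(self):
        self.epoch = 0

# Cell 2: add a method in a separate cell.
@add_to_class(Trainer)
def fit(self, num_epochs):
    for _ in range(num_epochs):
        self.epoch += 1

# Cell 3: use the completed class.
t = Trainer()
t.fit(3)
print(t.epoch)  # 3
```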
Python 3 features
Later on, we can use mypy to check the types. It may be too verbose to add typing for every function, but we should enforce it for key APIs.
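For instance, a key API could be annotated like the sketch below; `evaluate_accuracy` and its signature are illustrative stand-ins, not the real d2l function:

```python
# Hedged sketch: type hints on a key API let mypy check call sites and
# document the expected argument and return types.
from typing import List

def evaluate_accuracy(preds: List[int], labels: List[int]) -> float:
    """Return the fraction of predictions that match the labels."""
    correct = sum(int(p == y) for p, y in zip(preds, labels))
    return correct / len(labels)

print(f"accuracy {evaluate_accuracy([1, 0, 1], [1, 1, 1]):.3f}")  # accuracy 0.667
```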
We rewrite `"loss %.3f, name %s" % (loss, name)` into `f"loss {loss:.3f}, name {name}"`.
Open Questions
There are tiny differences between MX/PT/TF in the data loader, optimizer, etc. Can we share an almost identical `Trainer` definition across them? Especially for TF 2.0, Keras already provides a high-level `fit` function, so is a custom one necessary there?
We now maintain two libraries: d2l for the mxnet implementation and d2l_pytorch for pytorch. Do we want to put them into a single library? For example, in mxnet we would save code with "Save into the d2l.mxnet package". To use it, just:

```python
from d2l import mxnet as d2l
```