We discuss possible ways to organize the source code and distribute our Python packages.

Terminology

First of all, a distinction must be made between various terms referring to the development and distribution process to avoid confusion. All following definitions come from either PEP, Python Reference or Python Glossary:

module: An object that serves as an organizational unit of Python code. Modules have a namespace containing arbitrary Python objects. Modules are loaded into Python by the process of importing.
package: A Python module which can contain submodules or recursively, subpackages. Technically, a package is a Python module with an __path__ attribute.
- regular package: A traditional package, such as a directory containing an __init__.py file.
- namespace package: A PEP 420 package which serves only as a container for subpackages. Namespace packages may have no physical representation, and specifically are not like a regular package because they have no __init__.py file. Namespace packages are a mechanism for splitting a single Python package across multiple directories on disk. In current Python versions, an algorithm to compute the packages __path__ must be formulated. With the enhancement proposed in PEP 420, the import machinery itself will construct the list of directories that make up the package.
importing: The process by which Python code in one module is made available to Python code in another module.
distribution: A separately installable set of Python modules as stored in the Python package index (PyPI), and installed by distutils or setuptools.

Links

[1] Differences between namespace packages and regular packages, from PEP 420 [2] Packaging namespace packages from PyPA's Python Packaging User Guide [3] Namespace packages, from setuptools [4] Import system, packages, regular packages, namespace packages from Python Reference [5] PEP 328 -- Imports: Multi-Line and Absolute/Relative [6] PEP 366 -- Main module explicit relative imports [7] PEP 420 -- Implicit Namespace Packages

User and Developer requirements

User wants to use an algorithm:
- User must have installed in one's environment a Python module which contains a BaseAlgorithm implementation corresponding to one's preferred algorithm.
- User must have installed in one's environment all sufficient dependencies that make the aforementioned implementation functional.
- User must write a configuration file specifying the desired algorithm implementation through the means of a "unique" identifier (i.e. a "unique" string). [In current implementation this is the subclass' name; case-independent].
Developer wants to develop an algorithm:
- Developer must create a module in (namespace) package metaopt.algo. This module contains a class which interfaces BaseAlgorithm.
- Developer must declare any implementation's dependencies to the distribution's setup.py.
- Developer should enlist the implementation class as an entry point under group name OptimizationAlgorithm. This is a way that a class can be made discoverable to metaopt.core package.

Current state of affairs (14/02/2018)

Software ecosystem is organized is under package name metaopt which is a namespace package, composed by metaopt.core, metaopt.algo and metaopt.client subpackages.

metaopt.core: This regular package contains packages and modules which implement core functionality and console scripts.
metaopt.algo: Self-contained namespace package which contains base.py module, space.py module and any possible algorithm implementation.
metaopt.client: Regular package with helper code to be used from native Python user scripts to communicate with parent process from metaopt.core. Contains function report_results and in the future it will also contain functions for online statistics reporting during an evaluation of a trial, as well as receiving (possibly) response from parent process (optimization algorithm).

Optimization algorithms ought to interface BaseAlgorithm class contained in metaopt.algo.base module and advertise their implementations as entry points in setup.py of their distribution under group name OptimizationAlgorithm.

Implementation note: OptimizationAlgorithm (code) is the name of the class which subclasses BaseAlgorithm and is created by metaclass Factory (code). Factory functionality has as follows:

It searches for modules in the same path as a base class' package and imports them.
It uses pkg_resources.iter_entry_points (from package setuptools) to find any advertised entry points under the group name which coincides to be the Factory-type subclass' name (in this case OptimizationAlgorithm)
Steps (1) and (2) allow Python to have correct and complete knowledge of all immediate subclasses of base class' subclasses; so we access this information and save it using cls.__base__.__subclasses__().
When Factory is called to create another class, it checks __call__ parameter of_type against known subclasses names and if found, it calls the corresponding class and returns an object instance.

This Factory pattern is being reused 3 times in total in current metaopt's code. The corresponding names of the Factory-typed classes are: Database in metaopt.core.io.database package, Converter in metaopt.core.io.convert module, and OptimizationAlgorithm in metaopt.algo.base module.

Links

[1] Metaclasses from Python Reference. [2] A nice blogpost with UML-like diagrams to understand Python data model structure and flow (based around metaclasses). [3] Another nice blogpost to complement setuptools' reference about entry points.

Proposals

The following proposals discuss possible solutions to the problem of managing the code source and distributing software of BaseAlgorithm implementations which the core developing team develops. From the state of affairs above, it is apparent that if an external contributor or researcher wishes to extend software with an algorithm of one's own, but do not contribute it, it can be achievable easily and without having knowledge other than what is contained in namespace package metaopt.algo.

In addition, in any of the following schemes, it makes sense - although the following statement is up for discussion if needed - that any 'trivial' implementation (e.g. metaopt.algo.random module) that has dependencies on a subset of metaopt.core dependencies to reside in the "core" distribution.

Multiple Repositories, Multiple Distributions

Abstract: This scheme suggests that extensions should be grouped according to their external (to metaopt.core software) dependencies and be independently developed and distributed.

Multiple Repositories, Single Distribution

Abstract: This scheme suggests that extensions should be grouped analogously to the first scheme, be independently developed but distributed centrally through the means of git submodule.

Single Repository, Single Distribution

Abstract: This scheme suggests that extensions, which have internal dependencies and are core, should be placed in a central package (i.e. metaopt.algo?) and contributions which perhaps have external dependencies to a separate directory (e.g. contrib). Any code is developed in the same git repository and published under the same Python distribution.

Links

[1] Creating and Discovering Plugins from PyPA's Python Packaging User Guide

If I may, I'll try to summarize the problem and solutions with their respective advantages and disadvantages. @tsirif Please feel free to correct me if I made some mistakes.

For simplicity, let's assume for now that contributed algorithms are installed in the subpackage metaopt.algo.contrib.

A) There is two different kinds of contributions:

Contributions to metaopt.core, metaopt.client and core metaopt.algo
Contributions to metaopt.algo.contrib specifically

B) Problems arise from the fact that both type of contributions have different requirements

metaopt.{core,client,algo} I) Code must be of high quality II) Should have little dependencies III) Should be stable
metaopt.algo.contrib I) Code quality requirements should be less restrictive II) Potentially have many dependencies III) Can vary a lot

C) We have so far three possible solutions (enumerated in issues' description).

Decentralized contributed algorithm repositories. I) Contributed algorithms are totally separated from the official metaopt repository and are separate python packages. I) They are discoverable through their installation setup; they should be available in metaopt namespace (metaopt.algo.contrib).
Centralized contributed algorithm repositories. I) Contributed algorithms are separated repositories but inserted as submodules inside the official metaopt repository. II) They are automatically discoverable but can only be used if the user install their requirements.
Centralized contributed algorithm package. I) Contributed algorithms are placed in a central package (metaopt.algo.contrib). II) They are automatically discoverable but can only be used if the user install their requirements.

D) Points in favor or against the different solutions

(skip to table for a summary)

Decentralized contributed algorithm repositories. +) User only install packages it wants +) Requirements can be specified for the contributed repo, making the installation simple. -) From experience installations of external package using namespace is something users have difficulty with +) Modifications do not go through metaopt official repo +) Total liberty for developpers -) Hard to keep track if users ping us for fix on external repos -) Hard to sync with external repos when new releases break things
Centralized contributed algorithm repositories. +) User can clone only the submodule it want. +) Requirements can be specified for the contributed repo, making the installation simple. +) Installation of core is easy, installation of contributions' could be easy. +/-) Modifications do not go through metaopt official repo, but the official repo must update it's submodule to keep up to date version. +/-) Total liberty for developpers, but need to announce to official repo when updates to submodule version should be done. -) Hard to keep track if users ping us for fix on external repos -) Hard to sync with external repos when new releases break things
Centralized contributed algorithm package. +/-) All algo are present in user's installation, however only those which satisfied requirements are discoverable. +/-) Installation of core is easy, installation of contributions' requirements can get messy. -) Requirements must be installed manually by users. That being said, would it be possible to threat them as packages with each their setup.py? -) Modifications do not go through metaopt official repo -) Developpers need to pass through our reviews to get their algo on the official repo. They will likely be tempted to share their metaopt fork to other users. +/-) Easier to keep track if users ping us to fix contributed algos, but it can get messy if we try to maintain to much of them. +) Easier sync with contributed algos when new releases break things

	Decentralized repos	Centralized repos	Centralized package
package-wise install	+	+	+/-
easy to install	-	+	+/-
easy requirements	+	+	-
less PR to review	+	+/-	-
developer liberty	+	+/-	-
easy to keep track	-	-	+/-
easy to sync releases	-	-	+

E) Points in favor or against the different solutions assuming we do not give any support to contributed algorithms

This means bugs from contributions issued in metaopt are ignored and bugs caused in contributions by releases are ignored. This is assuming the bug does not find its source in metaopt itself.

(skip to table for a summary)

Decentralized contributed algorithm repositories. +) User only install packages it wants +) Requirements can be specified for the contributed repo, making the installation simple. -) From experience installations of external package using namespace is something users have difficulty with +) Modifications do not go through metaopt official repo +) Total liberty for developpers
Centralized contributed algorithm repositories. +) User can clone only the submodule it want. +) Requirements can be specified for the contributed repo, making the installation simple. +) Installation of core is easy, installation of contributions' could be easy. +/-) Modifications do not go through metaopt official repo, but the official repo must update it's submodule to keep up to date version. +/-) Total liberty for developpers, but need to announce to official repo when updates to submodule version should be done.
Centralized contributed algorithm package. +/-) All algo are present in user's installation, however only those which satisfied requirements are discoverable. +/-) Installation of core is easy, installation of contributions' requirements can get messy. -) Requirements must be installed manually by users. That being said, would it be possible to threat them as packages with each their setup.py? -) Modifications do not go through metaopt official repo -) Developpers need to pass through our reviews to get their algo on the official repo. They will likely be tempted to share their metaopt fork to other users.

	Decentralized repos	Centralized repos	Centralized package
package-wise install	+	+	+/-
easy to install	-	+	+/-
easy requirements	+	+	-
less PR to review	+	+/-	-
developer liberty	+	+/-	-

My opinion

All in all, I believe the best option would be a combination of centralized package and decentralized repos. The ultimate goal is to both make installation easy for average user and allow total flexibility to developers who work on new algorithms. I believe a larger core of algorithms is the best solution for the prior and a support for distributed repos is the best solution for the latter.

There would be algorithms we provide in metaopt.algo, and external algorithms would be available after installation in metaopt.algo.contrib.whatever. As part of the core development team, we would decide which algorithms are necessary or very useful to average user and include them in the core package. The additional requirements induced by a new algorithm in the core package would be taken in consideration to determine if it is included or not, but it would not de facto exclude an algorithm from the core package.

The only support we would give for external contributions would be about how to install them in general. There would be a demo about how to make such a contribution repository which would explain how to make the installation script and how to install the external package as a discoverable algorithm for metaopt. Beside those installation technicalities, external contributions would be considered as external libraries and any bugs within those would be out of our scope, considering those bugs aren't symptoms caused by bugs inside metaopt.

This hybrid solution would have the following characteristics

easy to install core
no external package needed for most users
possibility to install external contributions
less PR/issues about external contributions
developer liberty to work on external contributions
possibly more dependencies to core package
support for users having difficulty with external contribution installation

Ok let me plan out my exact thoughts on the development and distribution scheme too:

Multiple Repositories, Multiple Distributions

Abstract: This scheme suggests that extensions should be grouped according to their external (to metaopt.core software) dependencies and be independently developed and distributed.

The main distribution is called metaopt.core and declares the metaopt.{core,algo,client} subpackages as described above.

Dependencies

Its dependencies should be minimal. This means that all of them should reflect metaopt.core package only. The only algorithm that should be discoverable just by metaopt.core distribution itself should be the metaopt.algo.random only, as it is the only one that uses only the "exported" container class metaopt.algo.space.Space.

Packages

As I explained in the introductory comment, there are three packages. I will go through more details and lay down the delegation of responsibilities.

metaopt.core: This regular package contains packages and modules which implement core functionality and console scripts.

And only this. By itself is not functional because it depends on implementations of an exposed interface. However, the interface itself is not exported through this package.
metaopt.algo: Self-contained namespace package which contains base.py module, space.py module and any possible algorithm implementation.

There must be no contrib subpackage. It contradicts the whole point of having a metaopt.algo namespace package. Every subpackage namespaced under metaopt.algo must correspond to a collection of BaseAlgorithm implementations which are semantically and dependency-wise close. I will examine possible algorithm subpackages scenarios, to show what's the purpose of this scheme:
- metaopt.algo.skopt: Every algorithm which wraps something from skopt and as a result all of them depend on it. metaopt.core distribution should not have any dependency on skopt. It is not healthy for both users and developers to include it even as a optional/extra dependency, because:
  - This solution is not scalable and it makes keeping track actually worse, if we do it like this for every algorithm implementation that the core team decides to develop and maintain.
  - Users have to get informed by the means of documentation or mailing list, if dependencies change in further developing of such subpackage. Next metaopt iterations would randomly break their functioning ecosystem just because the dependencies are optional and they should themselves have the "developer" mentality of keeping track what to install extra or not. On the contrary side, distributing separately such packages, allows the user not to think about their dependencies because newer versions will declare their own hard dependencies, resolving them successfully and in a care-free wat at pip-install time.
  - Users have to get informed by the means of documentation or mailing list, if a new implementation becomes available. New implementations would mean bumping up the version number of metaopt.core distribution even though metaopt.core functionality has not changed at all. This can potentially create confusion to both users and developers, or users who want to start developing. The reasons are three:
    - Branch and merging hell. Numpy and scipy contributors should be aware of that mess.
    - New developers do not know where to begin with. Different development cycles intertwined. Core developers spend most of their time thinking about merges and organizing releases, while they should focus on metaopt.core package and automatic interface/integration testing.
- metaopt.algo.freezethaw: An algorithm like this would be itself a meta-algorithm containing a 'bayesian-type' algorithm. It may require that the sub-algorithm implements an extra interface also in order to communicate and have general access to useful information. For example an algorithm compatible with freezethaw should perhaps be a class BayesianSomething(BaseAlgorithm) which holds (possibly) an object from a "channel" class BayesianSomethingFT(FreezeThawInterface) and whose existence within an object from BayesianSomething indicates that this object is compatible with metaopt.algo.freezethaw. The extra abstract interface comes bundled separately with metaopt.algo.freezethaw distribution/package and it should come like this as freezethaw algorithm asks for an interface which is not required in a general optimization procedure; hence it should not be bundled with metaopt.core distribution and/or inside metaopt.algo.base module. This is also a good example of how optional dependencies are meant to be used in my mind, because it describes a case of (possible) horizontal dependency between different (vertical) extensions. In this particular example, metaopt.algo.freezethaw is not a necessary component of neither metaopt.algo.base nor metaopt.algo.bayesian_something. In the first case by the property of being a vertical extension and in the second case by the property of being unnecessary for the main functionality of a bayesian algorithm. Note for extra geekiness: I used aggregation in this example over a possible inheritance solution because the latter make metaopt.algo.freezethaw a necessary component for metaopt.algo.bayesian_something, while semantically this is not the case. And because in general good OOD favors aggregation over inheritance for reasons like this.
- metaopt.algo.meta: While possible simple meta-algorithms, like for instance an AlgorithmScheduler or a RandomSelection among sub-algorithms, will not have a dependencies which do not already exist in metaopt.core distribution, they remain an unnecessary component for the main functionality, something which many users would not like to use. Maintaining a case like this as a separate repo and distribution, is in line with KISS just because it tries to keep deployment as modular and minimal as possible.
metaopt.client: Regular package with helper code to be used from native Python user scripts to communicate with parent process from metaopt.core. Contains function report_results and in the future it will also contain functions for online statistics reporting during an evaluation of a trial, as well as receiving (possibly) response from parent process (optimization algorithm).

If I was to take things in the extreme and be true to the purpose of metaopt.core, I would distribute metaopt.client separately too for the following reasons:
- It's not a necessary component again. Reason be that a black-box process can implement routines like report_results by itself (~5 lines of code, using some json interface and an environmental). So that metaopt can be used by whichever language.
- report_results is a helper function for Python users, which depends only on Python's standard library. A script to make procedure more user-friendly for the people who do not know their tools well enough and want help with that.
- In the future, it shall contain also functions for client (user's script)-server (worker/metaopt.core) communication, such as:
  - Reporting online trial statistics for stuff like freezethaw and/or just logging e.g. validation accuracies or other arbitraty quantities.
  - Since the user's script can be anything, it can also be a RL environment like Gym, Mujoco, or BabyAI platform. With the same interface it can report a reward, state measurement and (possible) a set of possible actions for this state and expect to get in return an agent's decision on the action space. The agent can be a BaseAlgorithm. Also, a total argument in favor of distributing stuff is that if for example BabyAI people decide to use metaopt to ease asynchronous agent training (e.g. A3C) they would not like to bring to their framework irrelevant dependencies.
  - Experimentation with exotic algorithm or research, from neuroevolutional stuff to reservoir computing.
  - And all the above are not necessary components either.

Development

Issues and PRs from and by the core team regarding main functionality and interface discussions only. Algorithms should be separate because research-wise or mathsy discussions might also take place. Example projects that use this kind of scheme:

Python Code Quality Authority: for example flake-8 (link)
Zope Foundation (link)

So, for the metaopt.algo.* extensions that the core team develops, the core team maintains. They cannot do else but follow progress in metaopt.core; but that's how it goes anyways, because they are extensions of an exposed interface. However, that should also leave room for externally independently maintained extensions by users. Core developers do not care about maintaining these, they care only providing stable metaopt.core releases.

Installation

Regarding arguments pointing out difficulty in installation:

Distributing packages like this does not have the pitfalls about changing optional dependencies that I raised before in point Packages/metaopt.algo/metaopt.algo.skopt/2.
What could be more clear, simple and explicit than doing just a search in PyPI, a github organization for repos, or pip search metaopt?
Smooth experience on the integration depends on the level of automating testing maturity, which we could easily and automatically have for any possible registered plugin in a separate testing repo. I have done this before in Pandora for all possible software components and modules of a UGV robot. It's relatively easy to do in a generic manner since every extension inherits from the same interface and it is supposed to effectively give the same solution to an easy - solution-independent - problem such as a simple quadratic.
For the ultra lazy people which may want to sacrifice control over their system for magic convenience and a 2-minute install-to-usage experience, we should support a meta-distribution with MILA's favorite and recommended components and ready-to-run settings. Such metaopt-recommended should be as simple as that:
1. Hard frozen dependencies on metaopt.core and metaopt.algo.freezethaw, metaopt.algo.bayesian_something (or whatever we decide - this is up for possible changes also, users should not be able to tell a damn just by running the cmd executable as usual). Everything frozen in the appropriate version.
2. Default configuration file (moptconfig.yaml for the initiated with default choice for the algorithms to be used and their respective settings) in user configuration site (~/.config/metaopt following XDG standard). Currently, software is already searching there for possible configurations (hierarchical stuff).
3. Testing the integration of this - recommended by the core developers - setting.
So borrowing from PyPA's glossary this meta-distribution should define a Known Good Set (KGS).

A set of distributions at specified versions which are compatible with each other. Typically a test suite will be run which passes all tests before a specific set of packages is declared a known good set. This term is commonly used by frameworks and toolkits which are comprised of multiple individual distributions.

So, user experience can be possibly something as simple as this:
1. pip install metaopt-recommended
2. mopt ./train.py --learning-rate~'loguniform(0.001, 1.0)'

What can be further raised about installation troubles? I cannot think anything that's not solved by the aforementioned schema, which can suit any user's flavor.

Final note on this section regarding fears about namespace packages: Dropping support for Python2 makes things native and problem-free, if someone just follows PyPA's instructions or PEP-420.

Community

About forking smaller repos: I believe that this actually helps and encourages possible contributors because it's easier to search and keep in track with what's happening in a well-confined space.

About issues all over the place: Issues will be actually organized naturally where they semantically belong.

Keeping standards and distribution conventions: A demo repository or subdirectory (like this in /tests/functional/gradient_descent_algo) can be provided to act as a starting point for implementing a new algorithm. This directory can be bundled with linter configuration settings of our preference (flake8 and pylint). And for the extremely paranoid (like me), we could also create a flake8 extension for forcing a repo's structure to have the specific /src/metaopt/algo/... structure exactly as suggested by PEP-420.

I'm gonna comment it from the point of view of building a community on top of this. I love this software and I'm surely gonna be using it for my research. I would segment the people using this software as 1) novice: researchers who just want to optimize their neural networks/functions, 2) advance users: mathematicians/computer scientists that would want to build their own hyperparameter optimization algorithm, 3) very advanced users: computer scientists who believe the world is a better place if sharing happens and they give their code for free to be available easily via pip install. My crude estimation is that the segmentation is 50%,40%, and 10% for novice, advanced and very advanced users. Although pretty small, the 10% might be the driver of the community, while the 40% would surely think of moving their repos to be available for everybody via pypi.

It's like having a Matlab-extension where instead of having a crude installation process, we'd only have pip install to install another type of package. Back then I wish Matlab had those, but it didn't. This also makes us differ from the hyperparameter optimization packages out there that don't extensions build on top of the main package, thus in my point of view ripe for commercial value for our lab.

Before I read the full thing and give my more complete thoughts, a couple of links to the Blocks history about namespace packages:

I also remember there being complaints about having to point to specific dev branches of blocks in the CI buildbot config files for blocks-examples or blocks-extras, for changes that needed to be synchronized between repos. I can't find them at the moment, I'll post them if I come across that.

Dependencies

Let's first clarify the problem of dependencies. I fetched 2 first points of @tsirif comment and I added a third one.

Hard to keep track if we do this for every implementation that the core team decides to develop and maintain
Version breakage when dependencies changes. Users should keep track and update dependencies if needed
User can have problem installing dependencies they do not need. Installing dependencies on clusters is not always trivial.

Problem 1. highly depends on how many algorithms we add in the core. Yes, this solution would not scale, but it is not necessarily a problem if we do not intend to scale. The core should be small, even if we decide to add a few meta-optimizer algorithms.

Problem 2. is not clear to me to be honest. If dependencies are defined in setup.py with scrict versions, then any upgrade on metaopt which needs an upgrade on a dependency would trigger the latter update automatically. Dependencies of the core are certainly not optional. Users don't need to keep track.

Problem 3. is to me the most serious one. Adding skopt would probably not be such a pain, but what if some genetic algorithm or gradient descent algorithm become so popular we decide to add it to core? Would their dependencies be easy to install as well? Worst case, we keep them outside, just like they would have been if the core was bare empty of algorithms.

I think the minimal dependency problem is not a serious one if the set of core algorithms if kept very small. There is not much algorithms out there which are good candidates for the core anyway.

Why keep the core bare empty of algorithms

Again, I fetched 4 points of @tsirif comment.

Minimal dependencies
Broadcasting problem and version problem when adding new algorithms
Interface coherence: Some algorithmns have more involved interface requirements (FreezeThaw example)
Keep things simple. User dont need everything (meta algorithm example)

Problem 1 is discussed above.

Problem 2 depends again on how many core algorithms we intend do add. It seems to me that once we are done with a few fundamental ones, adding new core algorithms will be a very rare event. I can hardly see how this should significantly determines our design choices.

Problem 3 is not clear to me. What would be the problem if FreezeThaw is part of the core and requires additional methods from its subalgorithms?

Problem 4 is a double-edged sword. If the core code is simpler or leaner, how is this simpler:

pip install metaopt
mopt ./train.py --learning-rate~'loguniform(0.001, 1.0)'
> oops, meta not installed.
pip install metaopt-meta
mopt ./train.py --learning-rate~'loguniform(0.001, 1.0)'

To "Keep it simple stupid", I would answer "Make Simple Tasks Simple". Keeping those meta algorithms outside of the core would be based on ideological reasons, I'd favor pragmatism.

Target users

I think we should keep in mind what are our target users when deciding which distribution scheme we pick for metaopt.

What are the alternatives? There is for example skopt, Vizier (google), Hyper-opt and spearmint. My impression is that the user distribution, at least in the lab, is something like this:

95% use manual/random/grid search 5% use some framework or custom meta optimizer

So, do we want to fight against other frameworks or do we want to get people out of their misery? :stuck_out_tongue_closed_eyes: More seriously, why would people not be using those frameworks? What are those frameworks failing at? From my understanding based on collaborations I had up to now, there is two main reasons:

Inertia. @tsirif call it lazyness, I think it's not quite accurate. Give users new clusters with >500 powerfull GPUs and they will still fight for getting a few GPUs on a duck taped slurm system of 100 GPUs. They would even spend time looking for fancy solutions to improve the scheduler system on the 100 GPUs setup rather than spending time setting up the new cluster.
Trust. Many people do not trust black box optimization for optimizing their important new algorithm.

From that there seems to me that there is two important features we need to provide.

Painless; should install simply, run simply, wrap simply, provide results simply.
Improvement assurances; should provide vizualizations which show how the meta-optimizer found efficiently a good set of hyper-parameters.

Feature 2 is out of the scope of this discussion, but feature 1 is tightly related to the installation process.

We should keep in mind that every single bump in the process of installing, setting up, configuring, running and evaluating experiments is an additional friction susceptible of causing inertia.

Indeed, "painless" is always something we are looking for when developing softwares so pointing that out could be a bit useless. What I want to stress out here is that a large portion of the users are highly affected by "friction" so we should move those "frictions" as much as possible to the small portion of resilient users. What is worst, adding a small friction for 95% of the users, or adding a mid-friction for 5% of them?

Sure, I took those proportion numbers out of a hat, so it's certainly far from being accurate. Maybe we should make a survey for the lab. What do you think?

Alternatives

To conclude, if we go with "Multiple Repositories, Multiple Distributions" there would be 3 solutions:

Bare core; Pain goes to large proportion of users
Bare core + install bundles; Pain goes to developers
Minimal core; Pain goes to small proportion of users

So, what do you think is the best? I'm afraid if we decide to swallow it and pick solution 2., it might affect development later on. Time is limited. Furthermore, the small proportion of resilient users is likely to be the (very) advanced users @dendisuhubdy talked about. Will they really be annoyed so much with slightly larger core dependencies? I doubt about it.

@bouthilx suggestion that we raise an error like this would be nice to have. I agree.

pip install metaopt
mopt ./train.py --learning-rate~'loguniform(0.001, 1.0)'
> oops, meta not installed.
pip install metaopt-meta
mopt ./train.py --learning-rate~'loguniform(0.001, 1.0)'

On @bouthilx's question: "More seriously, why would people not be using those frameworks?" I talked to Philemon Brakel about running Spearmint on the lab machines and he said it's a pain in the b*tt to install and configure, one that we don't have. I don't necessarily think that "Many people do not trust black box optimization for optimizing their important new algorithm." but because they have never had a choice to try a new hyperparameter optimization algorithm or using it is just another pain in the b*tt.

Another reference about one core and multiple submodules -> not good, see a discussion here Why your company shouldn’t use Git submodules.

As @bouthilx's consideration, I also believe that choice 3 (a minimal core) is good, with a bet that the small proportion of users are (very) advanced users, say 5 of us right now and maybe 2 other people in the lab that aren't that afraid of a little bit engineering. To ease the process @tsirif might as well give a tutorial to the lab on how to develop algorithms on top of Metaopt for the whole lab, and probably have the next deep learning meetup in Montreal talking about that.

Trying to answer @bouthilx 's concerns...

Dependencies

I think that keeping core small is important, so it would not hurt distributing with it algorithms that share dependencies with it but do not bring in their own. So for example, I would not ship skopt_bayesian with core, just because skopt_bayesian needs to have a dependency on skopt. But let's for a moment consider the case that skopt_bayesian is distributed with metaopt.core, then one of 2 possible cases must hold: A. skopt is a hard dependency on metaopt.core; OR B. skopt is an optional dependency on metaopt.core.

Case (A) is a big NONO for me; namely the primary reason be that other implementations of the same algorithm would have dependencies on other stuff and metaopt.core should not show preference on which to use. This is because, it being used is not necessary for framework to be optional + its functionality is not something that we ourselves develop. Another reason would be that it's not metaopt.core 's job, but that would mainly be (forerunning the discussion a bit) due to me preferring object-first designs in all levels of development, i.e. ideology.

On the other hand, case (B) is case that causes hell. First of all, by using it allows that there are two possible ways of an algorithm having dependencies: 1. optional of metaopt.core, if is it is to be shipped within it; 2. hard of an extension package; if an extension is to be (and it will eventually, because this can happen - and it should be able to happen) distributed separately. Trying to clarify problem 2 in the list that you made, I would give the following example. Suppose that there is a BaseAlgorithm implementation, named Algo1 that depends on an external package, named 'X'. The only component of metaopt.core that necessarily uses 'X' is Algo1. However, the usage of Algo1 is only a possible case, but not a necessary. Imagine now that someone, who trusts in us, uses regularly this Algo1. However for reasons perhaps external to us, this dependency 'X' changes to a package named 'Y'. Then, the user in the next update of metaopt.core would have the necessary dependency of the possible component Algo1 changed, so when he runs it at the first time, one gets an error fo unsatisfied dependency. To resolve this, one should check what this change was, install the changed dependency 'Y' and uninstall the old dependency 'X' (assuming that one keeps track of what has been installed).

Problem 3's statement of possibility of adding popular algorithms to the core contradicts standard stated in Problem 1, trying to keep metaopt.core minimal.

TL;DR. My general rule of the thumb here, as I discussed in my second comment, is that metaopt.core should be allowed to ship algorithms that do not necessitate dependencies destinated only for their own exclusive usage. Put another way, any algorithm (if any- except Random) distributed with metaopt.core must have a dependency set which is a subset of core metaopt.core's dependencies.

TL;DR2. A possible algorithm, should be distributed as an extension, as it is not guaranteed to be always the execution case.

Why keep the core bare empty of algorithms

Keeping TL;DR2 in mind, FreezeThaw imposes an extra interface requirement which is only useful though for-itself. By-itself the interface is unnecessary for core's execution and hence it must be distributed separately. Otherwise, it can be the case that it is just bloat code, burdening core by extra maintaining for its contigent actuality (execution).

Example in problem 4 is exactly the case of the user having complete control of what they need vs what they have in their system. A modular system should be distributed in a modular way too. Where someone believes that this should not be the case, then one must agree that distributing everything into one large and bloated bundle is perfectly fine. In my opinion, the latter is not fine at all for non-tautological reasons which lie in development culture and dependency assignment. Having said all of these, meta-algorithm implementations was an example to highlight that possible cases should be distributed separately. However, them being a product of developing team itself and having none other dependencies than those of metaopt.core would make them fit to be distributed with metaopt.core. If it was completely in my hand though, I would also distribute them separately.

Target users

That's why I proposed the scheme with the meta-distribution having as hard-dependencies what we the developers personally also prefer to do our jobs with. I discussed this schema in note Installation/4 and it has the merits of both solving the "inertia" problem and being compatible with the "Multiple Repos, Multiple Distributions" scheme.

I agree completely on the survey test. I think that what I proposed "MR,MD" + a meta-distribution covers all flavors of target users.

Alternatives

If something is to be trusted and ensured smooth experience for the largest portion of the users, it needs to be tested for integration in particular. Hence, a place for these tests is needed. That repo, which serves as integration test repo for the combination of implementation that we, 'the developers', choose to support 100%, can serve as a meta-distribution if provided also with a default configuration file also. Which seems a natural and simple thing to maintain. Exactly what anyone would do for their favourite combination of metaopt algorithms: Using them frequently and having a default (convenience) configuration file for them. [They would put it in a default directory, too, if they wanted to take advantage of the hierarchical configuration].

@lamblin Regarding namespace packages, there is an extensive testing procedure for the various implementation methods proposed ({native, pkgutil, pkg_resources}) in this repo. The repo also contains sample usages.

Also, this table shows what it is supported and what's not.

[...] metaopt.core should not show preference on which [implementation] to use.

I don't see why. It can simply be "first come first served". If we use skopt for the first implementation, then that's the one in the core. Suppose we go for empty core and make bundles for easier installs, should we include all possible implementations available to be unbiased? It does not make sense to me. I could see a problem in favoring one over another if some algorithm definition was not stable, leading to different implementations having significant impact on their behavior, but that would be a serious reason not to include that algorithm in the core anyway.

I don't agree case (A) is a big NONO. It's a big "think about it well before including it" however.

I agree case (B) (optional dependecy on metaopt.core) would be pretty bad.

Problem 3's statement of possibility of adding popular algorithms to the core contradicts standard stated in Problem 1, trying to keep metaopt.core minimal.

It is not true unless one holds an extremist vision of "minimal", meaning empty of any algorithms. Minimal does not mean devoid of any additional dependencies, it means as less as possible.

FreezeThaw imposes an extra interface requirement which is only useful though for-itself

Yes, and this added interface would be in agreement with another sub-algorithm only. Nothing in the core would be affected by it. Furthermore, if we decide this algorithm is important enough to be maintained by our developers, we'll have to maintain it, be it in the core or in a separate package.

A modular system should be distributed in a modular way too. Where someone believes that this should not be the case, then one must agree that distributing everything into one large and bloated bundle is perfectly fine

This is not nuanced enough. Suppose we decide to ship in a modular way. How do you define what is a module and what is not. Modules could be aggregation of algorithms sharing dependencies. Modules could be aggregation of algorithms sharing commun behaviors. Modules could be single algorithms. Modules could be part of algorithms which are likely to be reused (exagerating a bit :stuck_out_tongue_closed_eyes: ).

Having a slightly larger minimal functional core with modules around is not a non modular distribution.

That's why I proposed the scheme with the meta-distribution having as hard-dependencies what we the developers personally also prefer to do our jobs with

Yes, in case you did not understand that was the point 2. in the Alternatives section.

I see your point about integration tests. You can agree however that an integration test on a dummy bare empty repo is something else than an integration test on a repo containing several algorithm implementations. Modifications on the core that could affect the integration test are less likely to cause trouble on the dummy repo. Anyhow, we could handle that. As I said, that's option 2, pain goes to the developer, users are safe.

My main concern with the option 2, and I think Pascal and Fred will agree with me on this, is that burdens on the developers should not be taken too lightly. We have very limited time, very limited number of developers, so we should make sure we get the best out of it. A super modular distribution is not worth much if the extra burden it causes prevents us from delivering critical features in reasonable time (especially if the alternative for the super modular distribution would be quite modular anyway). I'm not saying it will, but it worries me.

@tsirif Suppose we go with "MR,MD", could you make a package with a standalone dummy algorithm to demonstrate on Thursday?

[...] metaopt.core should not show preference on which [implementation] to use.

I don't see why. It can simply be "first come first served". If we use skopt for the first implementation, then that's the one in the core. Suppose we go for empty core and make bundles for easier installs, should we include all possible implementations available to be unbiased? It does not make sense to me. I could see a problem in favoring one over another if some algorithm definition was not stable, leading to different implementations having significant impact on their behavior, but that would be a serious reason not to include that algorithm in the core anyway.

Well, I suppose the case is that I would really like metaopt.core to be really a core, and not metaopt. I would not like to bring something, that does not depend only possibly on numpy or scipy, inside metaopt.core. I do not expect instabilities in skopt, my only problem is that it's not a necessary solution.

One could say: Then, why do you have a hard dependency on mongodb? Well, it's not databases what I try to sell. I want to show off the versatility on possible algorithms, that's what core enables. In my mind, metaopt is a distributed software ecosystem.

Problem 3's statement of possibility of adding popular algorithms to the core contradicts standard stated in Problem 1, trying to keep metaopt.core minimal.

It is not true unless one holds an extremist vision of "minimal", meaning empty of any algorithms. Minimal does not mean devoid of any additional dependencies, it means as less as possible.

If as less as possible can mean None, I am willing to have None :stuck_out_tongue:.

A modular system should be distributed in a modular way too. Where someone believes that this should not be the case, then one must agree that distributing everything into one large and bloated bundle is perfectly fine

This is not nuanced enough. Suppose we decide to ship in a modular way. How do you define what is a module and what is not.

Modules could be aggregation of algorithms sharing dependencies.

:v:

Modules could be aggregation of algorithms sharing common behaviors.

:v:

Modules could be single algorithms.

:v:

Modules could be part of algorithms which are likely to be reused (exagerating a bit :stuck_out_tongue_closed_eyes: ).

:-1: :stuck_out_tongue: Software that have this property is actually software which share dependencies. I believe such pieces should be distributed together.

Having a slightly larger minimal functional core with modules around is not a non modular distribution.

You are right. However, I am still hesitating to bundle code that it is not necessary in the core. (((Imagine that I would like to have metaopt.client as a separate distribution as well, but I do not dare to discuss this now. Behave like you never saw these two sentences.)))

That's why I proposed the scheme with the meta-distribution having as hard-dependencies what we the developers personally also prefer to do our jobs with

Yes, in case you did not understand that was the point 2. in the Alternatives section.

Damn, I got confused about what you meant by bundles.

I see your point about integration tests. You can agree however that an integration test on a dummy bare empty repo is something else

That's an interface test. There are stuff like that in metaopt.core 's functional tests.

than an integration test on a repo containing several algorithm implementations.

That's an integration test. Actual implementations distributed separately will be brought together to be tested on that separate repo.

Modifications on the core that could affect the integration test are less likely to cause trouble on the dummy repo. Anyhow, we could handle that. As I said, that's option 2, pain goes to the developer, users are safe.

My main concern with the option 2, and I think Pascal and Fred will agree with me on this, is that burdens on the developers should not be taken too lightly. We have very limited time, very limited number of developers, so we should make sure we get the best out of it. A super modular distribution is not worth much if the extra burden it causes prevents us from delivering critical features in reasonable time (especially if the alternative for the super modular distribution would be quite modular anyway). I'm not saying it will, but it worries me.

I see the separate distributions as a delegation of responsibility on management level also. A way to organize issue solving and development better.

@tsirif Suppose we go with "MR,MD", could you make a package with a standalone dummy algorithm to demonstrate on Thursday?

There is already one.

tests/functional/gradient_descent_algo is a subdirectory which contains a Python example repo distributing a very simple and predictable (for testing) algorithm, gradient descent. The black_box.py in tests/functional/demo is a process evaluating information from a 1D-quadratic. Tox installs it as a part of the testenv and devel command [why so and not as a dependency of these environments, is a pip issue which is not mine or tox' job - I can supply a link with the discussion on the problem and people complaining that pip "sucks"]. tests/functional/demo/test_demo.py::test_demo forks to execute mopt (functional test) with the correct configuration to instantiate that separately installed algorithm. Afterwards I check in the database for this execution's results. tests/functional/demo/test_demo.py::test_workon tests the same thing let's say, but in a unittest way by skipping cli.py and resolve_config.py to create an appropriate Experiment object on its own and call worker's workon(experiment) function immediately. It yields the same results.

OK, finally caught up with the discussion.

Before I reply with more details, here are a couple of concerns I prefer to make explicit.

Continuous integration (CI) running on PRs.

This is extremely valuable to detect issues before they are actually introduced, rather than having to wait for the thing to be merged before realizing it broke something, or reveals something had already been broken elsewhere.

This is where having different Git repos is a pain:

where do you run the tests? If you run each extention's tests in its repo, then the PRs to the core will not run those tests.
So you could also run them in the core. But if a test fails because of something that needs to be updated in the extension, how can you verify that the extension fix actually fixes the issue, since you still have the old extension code?
Each time you have PRs that depend on each other, it is a pain, because all scripts point to the master branch (or a release) of the other repo, not the corresponding PR.

metaopt-recommended

Maintaining that would also be a pain, I think as much as managing PRs to a single repo, especially for the issue of synchronizing bug fixes or interface changes across repos.

Misc remarks

I think we differ on what the likely cases are vs. unlikely, what will be happening often vs. exceptional cases. For instance, I think that extension with cumbersome external dependencies (like skopt) will be rare and can be managed on a case-by-case, some being integrated in metaopt.algo (or metaopt.contrib.algo or metaopt.algo.contrib), some in different repos (which I think should be exceptional).

contrib could be a separate tree of metaopt, with for instance metaopt.contrib.algo and metaopt.contrib.somethingelse; or we could have contrib sub-repos at each level where needed, for instance metaopt.core.contrib and metaopt.algo.contrib or even metaopt.algo.skopt.contrib. I'm not sure what the best would be.

If an extension "Algo1" changes from dependency X to dependency Y, then I would expect the error message to say something like "Algo1 now requires Y instead of X. Run pip install -e file:.#egg=metaopt[Algo1] (or the appropriate setup.py invocation) or install it manually. Btw you can get rid of X if you don't need it". If the user updates the package from PyPI, it should be automatically handled anyway.

Setting up a setup.py for an additional Python submodule is not something ML developers are familiar with. Most of them would much rather fork the whole thing, add their project inside, and ask their collaborators to clone their fork. We might as well accept PRs.

In any case, let's discuss that tomorrow.

Additional points raised in IRL meeting:

We would like to have a low barrier of entry for people wanting to try to implement new algorithms, who are academics and not software developers.
Exceptional cases where we would want a separate repos would be wrappers for different versions of the same external library or framework (we would not want to pin the requirement in the core, and do not want to have duplicated code for different versions)
We want to build a community, and give trust that we are not going to break their extensions silently (by having CI test extensions, at least minimally).
It's easier to enforce coding standards and organization when people want to contribute back, rather than scare them away by forcing that when they want to try starting experimenting with a high-risk research idea.

Epistimio / orion

Debate: How to distribute our software and its possible extensions #38

Terminology

Links

User and Developer requirements

Current state of affairs (14/02/2018)

Links

Proposals

Multiple Repositories, Multiple Distributions

Multiple Repositories, Single Distribution

Single Repository, Single Distribution

Links

A) There is two different kinds of contributions:

B) Problems arise from the fact that both type of contributions have different requirements

C) We have so far three possible solutions (enumerated in issues' description).

D) Points in favor or against the different solutions

E) Points in favor or against the different solutions assuming we do not give any support to contributed algorithms

My opinion

Multiple Repositories, Multiple Distributions

Dependencies

Packages

Development

Installation

Community

Dependencies

Why keep the core bare empty of algorithms

Target users

Alternatives

Dependencies

Why keep the core bare empty of algorithms

Target users

Alternatives

Continuous integration (CI) running on PRs.

metaopt-recommended

Misc remarks