aertslab / arboreto

A scalable python-based framework for gene regulatory network inference using tree-based ensemble regressors.
BSD 3-Clause "New" or "Revised" License

Ability to run Arboreto using a multiprocessing pool in place of Dask #21

Closed cflerin closed 4 years ago

cflerin commented 4 years ago

The ability to run Arboreto across multiple nodes with Dask is extremely powerful, but in practice the implementation has caused many problems for me (and, it seems, for others). The Dask client would sometimes appear to keep computing for days, or quit halfway through a run with a cryptic error.

In practice, I have only ever used a single node to run GRNBoost2, and it's still quite fast, even for tens to hundreds of thousands of cells. I therefore thought a multiprocessing implementation might be useful. I've been using it extensively and it's quite reliable; in many cases the compute time is actually slightly shorter than with Dask (perhaps due to Dask scheduling overhead?).

Summary of changes:

As a check, the multiprocessing implementation produces the same results as when using Dask, using a fixed seed:

# test data:
wget https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/allTFs_hg38.txt
wget https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/expr_mat.loom

pip install --force-reinstall git+https://github.com/cflerin/arboreto@multiprocessing
# python:
from arboreto.utils import load_tf_names
from arboreto.algo import grnboost2
import loompy as lp
import pandas as pd

lf = lp.connect('expr_mat.loom', mode='r', validate=False)
ex_matrix = pd.DataFrame(lf[:, :], index=lf.ra.Gene, columns=lf.ca.CellID).T
lf.close()

tf_names = load_tf_names('allTFs_hg38.txt')

# Dask test:
network = grnboost2(expression_data=ex_matrix,
                    tf_names=tf_names,
                    seed=777,
                    verbose=True)

>>> network.head()
        TF  target  importance
27   RPS4X   RPL30   57.537719
665   SPI1    CSTA   55.521603
27   RPS4X  EEF1A1   54.931686
27   RPS4X   RPS14   53.646867
692  RPL35    RPL3   52.932191

# multiprocessing test:
networkMP = grnboost2(expression_data=ex_matrix,
                      tf_names=tf_names,
                      client_or_address='multiprocessing',
                      multiprocessing_workers=7,
                      seed=777,
                      verbose=True)

>>> networkMP.head()
        TF  target  importance
27   RPS4X   RPL30   57.537719
665   SPI1    CSTA   55.521603
27   RPS4X  EEF1A1   54.931686
27   RPS4X   RPS14   53.646867
692  RPL35    RPL3   52.932191
cflerin commented 4 years ago

After some additional testing, it turns out this implementation is not ideal in terms of memory usage with larger matrices: the expression matrix is copied into each new process instead of being shared with the parent process. A better implementation can be found at https://github.com/aertslab/pySCENIC/pull/140; it replaces this PR with a stand-alone script that imports the relevant Arboreto/pySCENIC functions. The test results are the same as described above.