NVIDIA / fsi-samples

A collection of open-source, GPU-accelerated Python tools and examples for quantitative analyst tasks, leveraging the RAPIDS AI project, Numba, cuDF, and Dask.

[REVIEW] Simple external plugin example #113

Closed: yidong72 closed this 3 years ago

yidong72 commented 3 years ago

This is a simple external plugin example that uses the `entry point` mechanism to discover plugins. Check the `README.md` file to install and play with it. It depends only on Pandas and NumPy, so it is quick to install and try.
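
For readers unfamiliar with entry-point discovery, here is a minimal sketch of how a host application might look up plugins registered by installed packages. It uses only the standard library's `importlib.metadata`. The function name `discover_plugins` and the group name `my_framework.plugin` are hypothetical placeholders, not the names this PR or the project actually uses; see the README for the real ones.

```python
from importlib.metadata import entry_points


def discover_plugins(group):
    """Return {entry point name: loaded object} for every plugin in `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points are filtered with select().
        selected = eps.select(group=group)
    else:
        # Python 3.8 / 3.9: entry_points() returns a dict keyed by group.
        selected = eps.get(group, [])
    # load() imports the module and returns the object the entry point names.
    return {ep.name: ep.load() for ep in selected}


# "my_framework.plugin" is a hypothetical group name used for illustration.
plugins = discover_plugins("my_framework.plugin")
print(sorted(plugins))
```

An external package would advertise itself by declaring an entry point under the same group in its packaging metadata; once it is pip-installed, the host finds it without any explicit import or configuration.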

avolkov1 commented 3 years ago

I had to change a few things in the "docker/build.sh" script to successfully build a docker container. The build script I used is attached. You don't need to use it verbatim as some of the changes are just different formatting. The important changes are:

  1. Rapids version:

    -RAPIDS_VERSION="0.14.1"
    +RAPIDS_VERSION="0.17.0"
  2. You suggested using version 2.0 of jupyterlab-manager in the README.

    -RUN jupyter labextension install @jupyter-widgets/jupyterlab-manager --no-build  
    +RUN jupyter labextension install @jupyter-widgets/jupyterlab-manager@2.0 --no-build
  3. Use dask labextension below 5.0.0 because the latest one requires jupyterlab 3.

    -RUN pip install dask_labextension
    +RUN pip install "dask_labextension<5.0.0"
  4. In the nemo patch, allow a newer version of numba, which seems to work fine. Otherwise the install wants to downgrade llvmlite, and then the build fails.

    -+numba==0.49.1
    ++numba<=0.52.0

I used the option to build with Ubuntu 20.04 and CUDA 11.0.

Edit: Removed my attached build script to avoid confusion with listed changes.

I had to make one more change to make this work.

  1. Update the librosa library, which is one of nemo's install dependencies.
    - librosa<=0.7.2
    +-librosa<=0.7.2
    ++librosa<=0.8.0

yidong72 commented 3 years ago

I updated build.sh.

avolkov1 commented 3 years ago

One more issue: 05_customize_nodes_with_ports.ipynb uses the rmm.device_array API, which has been removed. Could you replace it with cuda.device_array, where cuda is imported from numba (already imported in the notebook)? The snippet in class NumbaDistanceNode should look like this:

    def process(self, inputs):
        df = inputs['points_df_in']
        number_of_threads = 16
        number_of_blocks = ((len(df) - 1) // number_of_threads) + 1
        # Allocate the output device array (uninitialized, like numpy.empty);
        # the kernel is expected to write every element.
        darr = cuda.device_array(len(df))
        distance_kernel[(number_of_blocks,), (number_of_threads,)](
            df['x'],
            df['y'],
            darr,
            len(df))
        df['distance_numba'] = darr
        return {'distance_df': df}
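
The block-count expression in the snippet above is a ceiling division: it picks the smallest number of blocks whose threads cover every row. A minimal pure-Python sketch of that arithmetic follows; the helper name `blocks_needed` is hypothetical and not part of the notebook.

```python
def blocks_needed(n_items, threads_per_block):
    """Smallest block count such that blocks * threads_per_block >= n_items."""
    # Same formula as in process(): ((n - 1) // threads) + 1
    return (n_items - 1) // threads_per_block + 1


for n in (1, 16, 17, 100):
    blocks = blocks_needed(n, 16)
    # The last block may contain surplus threads beyond n.
    print(n, blocks, blocks * 16 - n)
```

Because the last block can hold more threads than there are remaining rows, a kernel launched this way needs to skip indices at or beyond the row count, which is presumably why `len(df)` is passed as the final kernel argument.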

Please make the same cuda.device_array change in notebooks/custom_port_nodes.py.

Thanks.