rapidsai / cudf

cuDF - GPU DataFrame Library
https://docs.rapids.ai/api/cudf/stable/
Apache License 2.0

[BUG] DataFrame.as_gpu_matrix(order='C') is slower than DataFrame.as_gpu_matrix() followed by transpose() #2454

Closed kayush2O6 closed 5 years ago

kayush2O6 commented 5 years ago

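Describe the bug

DataFrame.as_gpu_matrix(order='C') is far slower than calling DataFrame.as_gpu_matrix() and then transpose() twice, even though both produce the same C-contiguous array. In the reproduction below, on a 1,000,000 x 50 float32 DataFrame, the first approach takes about 248 s and the second about 0.52 s.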
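For readers less familiar with the two layouts: a column-major ('F') buffer stores each column contiguously, while a row-major ('C') buffer stores each row contiguously, so converting one to the other reorders every element. The pure-Python sketch below shows that index mapping on a flat list. The helper name `f_to_c_order` is hypothetical and is not part of cuDF or Numba; it only illustrates the reordering and does not describe how either library implements it.

```python
def f_to_c_order(flat_f, n_rows, n_cols):
    """Reorder a flat column-major buffer into row-major order (illustrative only)."""
    if len(flat_f) != n_rows * n_cols:
        raise ValueError("buffer length does not match shape")
    # Element (r, c) sits at index c * n_rows + r in column-major order
    # and at index r * n_cols + c in row-major order.
    return [flat_f[c * n_rows + r] for r in range(n_rows) for c in range(n_cols)]


# The 2x3 matrix [[1, 2, 3], [4, 5, 6]] stored column by column.
flat_f = [1, 4, 2, 5, 3, 6]
flat_c = f_to_c_order(flat_f, n_rows=2, n_cols=3)
print(flat_c)  # [1, 2, 3, 4, 5, 6]
```

The values are identical in both buffers; only their order in memory changes, which is why the two results in this report compare equal.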

Steps/Code to reproduce bug

import cudf 
import numpy as np
import time

n = 1000000
m = 50

gdf = cudf.DataFrame()
for i in range(m):
    gdf[i] = np.random.random_sample(n)

for c in gdf.columns:
    gdf[c]=gdf[c].astype(np.float32)

# Path 1: request a C-contiguous matrix directly
st = time.time()
gmat = gdf.as_gpu_matrix(order='C')
print("time taken", time.time()-st)
# time taken 247.8072156906128
print(gmat.flags)
# {'F_CONTIGUOUS': False, 'C_CONTIGUOUS': True}

# Path 2: default as_gpu_matrix(), then transpose twice
st = time.time()
mat = gdf.as_gpu_matrix()
tmat = mat.transpose().transpose()
print("time taken", time.time()-st)
# time taken 0.520819902420044
print(tmat.flags)
# {'F_CONTIGUOUS': False, 'C_CONTIGUOUS': True}

# check the values
print(np.array_equal(gmat.copy_to_host(), tmat.copy_to_host()))
# True
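The timings above come from a single `time.time()` measurement per path. A single run may include one-time costs, and GPU work can still be queued when the host call returns, so a slightly more careful measurement repeats the call and optionally synchronizes before stopping the clock. The helper below is a minimal sketch of that idea using only the standard library; `time_call` and its `sync` parameter are hypothetical names introduced here, not part of cuDF.

```python
import time


def time_call(fn, *args, repeats=3, sync=None, **kwargs):
    """Return (best wall-clock seconds, last result) of fn over `repeats` runs."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        if sync is not None:
            # Assumption: the caller passes a blocking function such as
            # numba.cuda.synchronize so queued GPU work is included.
            sync()
        best = min(best, time.perf_counter() - start)
    return best, result


# CPU-only demonstration so the sketch runs without a GPU.
elapsed, total = time_call(sum, range(1000))
print(f"best of 3: {elapsed:.6f}s, result={total}")
```

With the DataFrame from this report, the two paths could be compared as `time_call(gdf.as_gpu_matrix, order='C', sync=cuda.synchronize)` and `time_call(lambda: gdf.as_gpu_matrix().transpose().transpose(), sync=cuda.synchronize)`, assuming `from numba import cuda`. A gap as large as the one reported would not be explained by measurement noise alone.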

Environment details

<details><summary>Click here to see environment details</summary><pre>

     ***git***
     Not inside a git repository

     ***OS Information***
     DISTRIB_ID=Ubuntu
     DISTRIB_RELEASE=16.04
     DISTRIB_CODENAME=xenial
     DISTRIB_DESCRIPTION="Ubuntu 16.04.6 LTS"
     NAME="Ubuntu"
     VERSION="16.04.6 LTS (Xenial Xerus)"
     ID=ubuntu
     ID_LIKE=debian
     PRETTY_NAME="Ubuntu 16.04.6 LTS"
     VERSION_ID="16.04"
     HOME_URL="http://www.ubuntu.com/"
     SUPPORT_URL="http://help.ubuntu.com/"
     BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
     VERSION_CODENAME=xenial
     UBUNTU_CODENAME=xenial
     Linux a293a33e114b 4.4.0-154-generic #181-Ubuntu SMP Tue Jun 25 05:29:03 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

     ***GPU Information***
     Fri Aug  2 03:45:22 2019
     +-----------------------------------------------------------------------------+
     | NVIDIA-SMI 418.67       Driver Version: 418.67       CUDA Version: 10.1     |
     |-------------------------------+----------------------+----------------------+
     | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
     | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
     |===============================+======================+======================|
     |   0  Tesla P100-SXM2...  On   | 00000000:05:00.0 Off |                    0 |
     | N/A   36C    P0    43W / 300W |   2507MiB / 16280MiB |      0%      Default |
     +-------------------------------+----------------------+----------------------+
     |   1  Tesla P100-SXM2...  On   | 00000000:06:00.0 Off |                    0 |
     | N/A   33C    P0    44W / 300W |    293MiB / 16280MiB |      0%      Default |
     +-------------------------------+----------------------+----------------------+
     |   2  Tesla P100-SXM2...  On   | 00000000:84:00.0 Off |                    0 |
     | N/A   28C    P0    35W / 300W |     10MiB / 16280MiB |      0%      Default |
     +-------------------------------+----------------------+----------------------+
     |   3  Tesla P100-SXM2...  On   | 00000000:85:00.0 Off |                    0 |
     | N/A   35C    P0    44W / 300W |   5890MiB / 16280MiB |      0%      Default |
     +-------------------------------+----------------------+----------------------+

     +-----------------------------------------------------------------------------+
     | Processes:                                                       GPU Memory |
     |  GPU       PID   Type   Process name                             Usage      |
     |=============================================================================|
     +-----------------------------------------------------------------------------+

     ***CPU***
     Architecture:          x86_64
     CPU op-mode(s):        32-bit, 64-bit
     Byte Order:            Little Endian
     CPU(s):                56
     On-line CPU(s) list:   0-55
     Thread(s) per core:    2
     Core(s) per socket:    14
     Socket(s):             2
     NUMA node(s):          2
     Vendor ID:             GenuineIntel
     CPU family:            6
     Model:                 79
     Model name:            Intel(R) Xeon(R) CPU E5-2650L v4 @ 1.70GHz
     Stepping:              1
     CPU MHz:               1999.957
     CPU max MHz:           2500.0000
     CPU min MHz:           1200.0000
     BogoMIPS:              3402.97
     Virtualization:        VT-x
     L1d cache:             32K
     L1i cache:             32K
     L2 cache:              256K
     L3 cache:              35840K
     NUMA node0 CPU(s):     0-13,28-41
     NUMA node1 CPU(s):     14-27,42-55
     Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb invpcid_single intel_pt ssbd ibrs ibpb stibp kaiser tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdseed adx smap xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts md_clear flush_l1d

     ***CMake***
     /conda/bin/cmake
     cmake version 3.14.5

     CMake suite maintained and supported by Kitware (kitware.com/cmake).

     ***g++***
     /usr/bin/g++
     g++ (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
     Copyright (C) 2015 Free Software Foundation, Inc.
     This is free software; see the source for copying conditions.  There is NO
     warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

     ***nvcc***
     /usr/local/cuda/bin/nvcc
     nvcc: NVIDIA (R) Cuda compiler driver
     Copyright (c) 2005-2018 NVIDIA Corporation
     Built on Sat_Aug_25_21:08:01_CDT_2018
     Cuda compilation tools, release 10.0, V10.0.130

     ***Python***
     /conda/bin/python
     Python 3.7.3

     ***Environment Variables***
     PATH                            : /conda/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
     LD_LIBRARY_PATH                 : :/usr/local/cuda/lib64:/usr/local/lib:/conda/lib:/usr/local/cuda/lib64:/usr/local/lib
     NUMBAPRO_NVVM                   : /usr/local/cuda/nvvm/lib64/libnvvm.so
     NUMBAPRO_LIBDEVICE              : /usr/local/cuda/nvvm/libdevice/
     CONDA_PREFIX                    :
     PYTHON_PATH                     :

     ***conda packages***
     /conda/bin/conda
     # packages in environment at /conda:
     #
     # Name                    Version                   Build  Channel
     _libgcc_mutex             0.1                        main
     absl-py                   0.7.1                    pypi_0    pypi
     arrow-cpp                 0.12.1           py37h0e61e49_0    conda-forge
     asn1crypto                0.24.0                   py37_0
     astor                     0.8.0                    pypi_0    pypi
     atomicwrites              1.3.0                      py_0    conda-forge
     attrs                     19.1.0                     py_0    conda-forge
     backcall                  0.1.0                      py_0    conda-forge
     blas                      1.0                         mkl
     bleach                    3.1.0                      py_0    conda-forge
     bokeh                     1.2.0                    py37_0
     boost                     1.68.0          py37h8619c78_1001    conda-forge
     boost-cpp                 1.68.0            h11c811c_1000    conda-forge
     boto                      2.49.0                   pypi_0    pypi
     boto3                     1.9.162                    py_0
     botocore                  1.12.163                   py_0
     bzip2                     1.0.7                h7b6447c_0
     ca-certificates           2019.5.15                     0
     certifi                   2019.6.16                py37_0
     cffi                      1.12.3           py37h8022711_0    conda-forge
     chardet                   3.0.4                    py37_1
     clangdev                  8.0.0                hc9558a2_2    conda-forge
     click                     7.0                      pypi_0    pypi
     cloudpickle               1.2.1                      py_0    conda-forge
     cmake                     3.14.5               hf94ab9c_0    conda-forge
     cmake-setuptools          0.1.3                    pypi_0    pypi
     conda                     4.7.10                   py37_0
     conda-package-handling    1.3.11                   py37_0
     cryptography              2.6.1            py37h1ba5d50_0
     cudatoolkit               10.0.130                      0
     cudf                      0.8.0                    pypi_0    pypi
     cuml                      0.8.0                    pypi_0    pypi
     curl                      7.62.0               hbc83047_0
     cycler                    0.10.0                   pypi_0    pypi
     cython                    0.29.12          py37he1b5a44_0    conda-forge
     cytoolz                   0.10.0           py37h516909a_0    conda-forge
     dask-core                 2.1.0                      py_0    conda-forge
     dask-cudf                 0.0.0.dev0               pypi_0    pypi
     dask-cuml                 0.8.0                    pypi_0    pypi
     decorator                 4.4.0                      py_0    conda-forge
     defusedxml                0.5.0                      py_1    conda-forge
     distributed               2.1.0                      py_0    conda-forge
     docutils                  0.14                     py37_0
     entrypoints               0.3                   py37_1000    conda-forge
     expat                     2.2.5             he1b5a44_1003    conda-forge
     freetype                  2.9.1                h8a8886c_1
     gast                      0.2.2                    pypi_0    pypi
     gensim                    3.8.0                    pypi_0    pypi
     google-pasta              0.1.7                    pypi_0    pypi
     grpcio                    1.22.0                   pypi_0    pypi
     h5py                      2.9.0                    pypi_0    pypi
     heapdict                  1.0.0                 py37_1000    conda-forge
     icu                       58.2              hf484d3e_1000    conda-forge
     idna                      2.8                      py37_0
     importlib_metadata        0.18                     py37_0    conda-forge
     intel-openmp              2019.4                      243
     ipykernel                 5.1.1            py37h24bf2e0_0    conda-forge
     ipython                   7.6.1            py37h5ca1d4c_0    conda-forge
     ipython_genutils          0.2.0                      py_1    conda-forge
     jedi                      0.14.1                   py37_0    conda-forge
     jinja2                    2.10.1                     py_0    conda-forge
     jmespath                  0.9.4                      py_0
     joblib                    0.13.2                   py37_0
     jpeg                      9b                   h024ee3a_2
     jsonschema                3.0.1                    py37_0    conda-forge
     jupyter_client            5.3.1                      py_0    conda-forge
     jupyter_core              4.4.0                      py_0    conda-forge
     jupyterlab                0.35.5           py37hf63ae98_0
     jupyterlab_server         0.2.0                    py37_0
     keras-applications        1.0.8                    pypi_0    pypi
     keras-preprocessing       1.1.0                    pypi_0    pypi
     kiwisolver                1.1.0                    pypi_0    pypi
     libarchive                3.3.3                h5d8350f_5
     libcurl                   7.62.0               h20c2e04_0
     libedit                   3.1.20181209         hc058e9b_0
     libffi                    3.2.1                hd88cf55_4
     libgcc-ng                 8.2.0                hdf63c60_1
     libgfortran-ng            7.3.0                hdf63c60_0
     libpng                    1.6.37               hbc83047_0
     libprotobuf               3.6.1             hdbcaa40_1001    conda-forge
     libsodium                 1.0.17               h516909a_0    conda-forge
     libssh2                   1.8.2                h22169c7_2    conda-forge
     libstdcxx-ng              8.2.0                hdf63c60_1
     libtiff                   4.0.10               h2733197_2
     libuv                     1.30.1               h516909a_0    conda-forge
     libxml2                   2.9.9                hea5a465_1
     llvmlite                  0.29.0           py37hf484d3e_0    numba
     lz4-c                     1.8.1.2              h14c3975_0
     lzo                       2.10                 h49e0be7_2
     markdown                  3.1.1                    pypi_0    pypi
     markupsafe                1.1.1            py37h14c3975_0    conda-forge
     matplotlib                3.1.1                    pypi_0    pypi
     mistune                   0.8.4           py37h14c3975_1000    conda-forge
     mkl                       2019.4                      243
     mkl-service               2.0.2            py37h7b6447c_0
     mkl_fft                   1.0.13           py37h516909a_1    conda-forge
     mkl_random                1.0.4            py37hf2d7682_0    conda-forge
     more-itertools            7.1.0                      py_0    conda-forge
     msgpack-python            0.6.1            py37h6bb024c_0    conda-forge
     nbconvert                 5.5.0                      py_0    conda-forge
     nbformat                  4.4.0                      py_1    conda-forge
     ncurses                   6.1                  he6710b0_1
     ninja                     1.9.0            py37hfd86e86_0
     notebook                  5.7.8                    py37_1    conda-forge
     numba                     0.43.0                   pypi_0    pypi
     numpy                     1.16.4           py37h7e9f1db_0
     numpy-base                1.16.4           py37hde5b4d6_0
     nvgraph                   0.1.0.dev0           cuda10.0_9    nvidia/label/cuda10.0
     nvstrings-cuda100         0.0.0.dev0               pypi_0    pypi
     olefile                   0.46                     py37_0
     openssl                   1.1.1c               h7b6447c_1
     opt-einsum                2.3.2                    pypi_0    pypi
     packaging                 19.0                       py_0    conda-forge
     pandas                    0.23.4          py37h637b7d7_1000    conda-forge
     pandoc                    2.7.3                         0    conda-forge
     pandocfilters             1.4.2                      py_1    conda-forge
     parquet-cpp               1.5.1                         4    conda-forge
     parso                     0.5.0                      py_0    conda-forge
     pexpect                   4.7.0                    py37_0    conda-forge
     pickleshare               0.7.5                 py37_1000    conda-forge
     pillow                    6.0.0            py37h34e0f95_0
     pip                       19.0.3                   py37_0
     pluggy                    0.12.0                     py_0    conda-forge
     prometheus_client         0.7.1                      py_0    conda-forge
     prompt_toolkit            2.0.9                      py_0    conda-forge
     protobuf                  3.9.0                    pypi_0    pypi
     psutil                    5.6.3            py37h516909a_0    conda-forge
     ptyprocess                0.6.0                   py_1001    conda-forge
     py                        1.8.0                      py_0    conda-forge
     pyarrow                   0.12.1           py37hbbcf98d_0    conda-forge
     pycosat                   0.6.3            py37h14c3975_0
     pycparser                 2.19                     py37_0
     pygments                  2.4.2                      py_0    conda-forge
     pyopenssl                 19.0.0                   py37_0
     pyparsing                 2.4.0                      py_0    conda-forge
     pyrsistent                0.15.3           py37h516909a_0    conda-forge
     pysocks                   1.6.8                    py37_0
     pytest                    5.0.1                    py37_0    conda-forge
     python                    3.7.3                h0371630_0
     python-dateutil           2.8.0                      py_0    conda-forge
     python-libarchive-c       2.8                     py37_10
     pytorch                   1.1.0           py3.7_cuda10.0.130_cudnn7.5.1_0    pytorch
     pytz                      2019.1                     py_0    conda-forge
     pyyaml                    5.1.1            py37h516909a_0    conda-forge
     pyzmq                     18.0.2           py37hc4ba49a_1    conda-forge
     readline                  7.0                  h7b6447c_5
     requests                  2.21.0                   py37_0
     rhash                     1.3.6             h14c3975_1001    conda-forge
     ruamel_yaml               0.15.46          py37h14c3975_0
     s3fs                      0.2.1                      py_0
     s3transfer                0.2.0                    py37_0
     scikit-learn              0.21.2           py37hd81dba3_0
     scipy                     1.2.1            py37h7c811a0_0
     send2trash                1.5.0                      py_0    conda-forge
     setuptools                41.0.1                   py37_0
     six                       1.12.0                   py37_0
     smart-open                1.8.4                    pypi_0    pypi
     snakeviz                  2.0.1                    pypi_0    pypi
     sortedcontainers          2.1.0                      py_0    conda-forge
     sqlite                    3.27.2               h7b6447c_0
     tb-nightly                1.15.0a20190714          pypi_0    pypi
     tblib                     1.4.0                      py_0    conda-forge
     termcolor                 1.1.0                    pypi_0    pypi
     terminado                 0.8.2                    py37_0    conda-forge
     testpath                  0.4.2                   py_1001    conda-forge
     tf-estimator-nightly      1.14.0.dev2019071001          pypi_0    pypi
     tf-nightly                1.15.0.dev20190715          pypi_0    pypi
     tf-nightly-gpu            1.15.0.dev20190715          pypi_0    pypi
     thrift-cpp                0.12.0            h0a07b25_1002    conda-forge
     tk                        8.6.8                hbc83047_0
     toolz                     0.10.0                     py_0    conda-forge
     torchvision               0.3.0           py37_cu10.0.130_1    pytorch
     tornado                   6.0.3            py37h516909a_0    conda-forge
     tqdm                      4.32.1                     py_0
     traitlets                 4.3.2                 py37_1000    conda-forge
     urllib3                   1.24.1                   py37_0
     wcwidth                   0.1.7                      py_1    conda-forge
     webencodings              0.5.1                      py_1    conda-forge
     werkzeug                  0.15.4                   pypi_0    pypi
     wheel                     0.33.1                   py37_0
     wrapt                     1.11.2                   pypi_0    pypi
     xz                        5.2.4                h14c3975_4
     yaml                      0.1.7                had09818_2
     zeromq                    4.3.2                he1b5a44_2    conda-forge
     zict                      1.0.0                    pypi_0    pypi
     zipp                      0.5.1                      py_0    conda-forge
     zlib                      1.2.11               h7b6447c_3
     zstd                      1.3.7                h0b5b093_0

</pre></details>

kkraus14 commented 5 years ago

@AK-ayush, the plan is to replace the implementation of this function entirely with #1898, which will be much more performant.

kkraus14 commented 5 years ago

Given the plan is to move this to a libcudf function anyway, I'm going to close this as a duplicate.