diku-dk / bfast

GPU Implementation for BFAST
GNU General Public License v3.0

Error running BFASTMonitor example on a CPU: Build Program Failure in one env and kernel dies in another #19

Closed: rbavery closed this issue 3 years ago

rbavery commented 3 years ago

I'm trying to run the example on my MacBook Pro (no GPU) as a test. I'm new to PyOpenCL, but I read in the docs that it supports both CPUs and GPUs:

It is important to note that OpenCL is not restricted to GPUs. In fact, no special hardware is required to use OpenCL for computation–your existing CPU is enough. On Linux or macOS, type:

conda install pocl

to install a CPU-based OpenCL driver. On Windows, you may install e.g. the CPU OpenCL driver from Intel. On macOS, pocl can offer a marked robustness (and, sometimes, performance) improvement over the OpenCL drivers built into the operating system.

I installed pyopencl from conda-forge:

pyopencl 2018.2.5 py37h9888f84_0 conda-forge

And first tried to run the example with this installation:

import os
import wget
import numpy
from datetime import datetime

# download and parse input data
ifile_meta = "../data/bfast/peru_small/dates.txt"
ifile_data = "../data/bfast/peru_small/data.npy"

if not os.path.isdir("../data/bfast/peru_small"):
    os.makedirs("../data/bfast/peru_small")

if not os.path.exists(ifile_meta):
    url = 'https://sid.erda.dk/share_redirect/fcwjD77gUY/dates.txt'
    wget.download(url, ifile_meta)
if not os.path.exists(ifile_data):
    url = 'https://sid.erda.dk/share_redirect/fcwjD77gUY/data.npy'
    wget.download(url, ifile_data)

data_orig = numpy.load(ifile_data)
with open(ifile_meta) as f:
    dates = f.read().split('\n')
    dates = [datetime.strptime(d, '%Y-%m-%d') for d in dates if len(d) > 0]

from bfast.utils import crop_data_dates
start_hist = datetime(2002, 1, 1)
start_monitor = datetime(2010, 1, 1)
end_monitor = datetime(2018, 1, 1)
data, dates = crop_data_dates(data_orig, dates, start_hist, end_monitor)
print("First date: {}".format(dates[0]))
print("Last date: {}".format(dates[-1]))
print("Shape of data array: {}".format(data.shape))

from bfast import BFASTMonitor

model = BFASTMonitor(
            start_monitor,
            freq=365,
            k=3,
            hfrac=0.25,
            trend=False,
            level=0.05,
            backend='opencl',
            device_id=0,
        )
model.fit(data, dates, n_chunks=5, nan_value=-32768)

print("Detected breaks")
# -2 corresponds to not enough data for a pixel
# -1 corresponds to "no breaks detected"
# idx with idx>=0 corresponds to the position of the first break
print(model.breaks)

I get this error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-1-6b81a4d59de0> in <module>
     45             device_id=0,
     46         )
---> 47 model.fit(data, dates, n_chunks=5, nan_value=-32768)
     48 
     49 print("Detected breaks")

~/opt/miniconda3/envs/pybayts/lib/python3.7/site-packages/bfast/models.py in fit(self, data, dates, n_chunks, nan_value)
    181                  verbose=self.verbose,
    182                  platform_id=self.platform_id,
--> 183                  device_id=self.device_id
    184                 )
    185 

~/opt/miniconda3/envs/pybayts/lib/python3.7/site-packages/bfast/monitor/opencl/base.py in __init__(self, start_monitor, freq, k, hfrac, trend, level, period, detailed_results, find_magnitudes, verbose, platform_id, device_id)
    110                                  default_tile_size=8,
    111                                  default_reg_tile_size=3,
--> 112                                  sizes=self._get_futhark_params())
    113 
    114     def _init_device(self, platform_id, device_id):

~/opt/miniconda3/envs/pybayts/lib/python3.7/site-packages/bfast/monitor/opencl/bfastfinal.py in __init__(self, command_queue, interactive, platform_pref, device_pref, default_group_size, default_num_groups, default_tile_size, default_reg_tile_size, default_threshold, sizes)
  48814                                                                            "value": None},
  48815                                         "mainMagnitude.tile_size_41656": {"class": "tile_size", "value": None},
> 48816                                         "remove_nans.segmap_group_size_29490": {"class": "group_size", "value": None}})
  48817     self.builtinzhreplicate_f32zireplicate_44709_var = program.builtinzhreplicate_f32zireplicate_44709
  48818     self.builtinzhreplicate_i32zireplicate_44718_var = program.builtinzhreplicate_i32zireplicate_44718

~/opt/miniconda3/envs/pybayts/lib/python3.7/site-packages/bfast/monitor/opencl/bfastfinal.py in initialise_opencl_object(self, program_src, command_queue, interactive, platform_pref, device_pref, default_group_size, default_num_groups, default_tile_size, default_reg_tile_size, default_threshold, size_heuristics, required_types, all_sizes, user_sizes)
    220         return cl.Program(self.ctx, program_src).build(
    221             ["-DLOCKSTEP_WIDTH={}".format(lockstep_width)]
--> 222             + ["-D{}={}".format(s.replace('z', 'zz').replace('.', 'zi').replace('#', 'zh'),v) for (s,v) in self.sizes.items()])
    223 
    224 def opencl_alloc(self, min_size, tag):

~/opt/miniconda3/envs/pybayts/lib/python3.7/site-packages/pyopencl/__init__.py in build(self, options, devices, cache_dir)
    508                         self._context, self._source, options_bytes, devices,
    509                         cache_dir=cache_dir, include_path=include_path),
--> 510                     options_bytes=options_bytes, source=self._source)
    511 
    512             if was_cached:

~/opt/miniconda3/envs/pybayts/lib/python3.7/site-packages/pyopencl/__init__.py in _build_and_catch_errors(self, build_func, options_bytes, source)
    552         # Python 3.2 outputs the whole list of currently active exceptions
    553         # This serves to remove one (redundant) level from that nesting.
--> 554         raise err
    555 
    556     # }}}

RuntimeError: clBuildProgram failed: BUILD_PROGRAM_FAILURE - clBuildProgram failed: BUILD_PROGRAM_FAILURE - clBuildProgram failed: BUILD_PROGRAM_FAILURE

Build on <pyopencl.Device 'Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz' on 'Apple' at 0x7fced9aea050>:

(options: -DLOCKSTEP_WIDTH=1 -Dbuiltinzhreplicate_f32zigroup_sizze_44712=32 -Dbuiltinzhreplicate_i32zigroup_sizze_44721=32 -DmainziRx_41192=3 -DmainziRx_41933=3 -DmainziRy_41193=3 -DmainziRy_41934=3 -DmainziTk_41189=8 -DmainziTk_41930=8 -DmainziTx_41044=8 -DmainziTx_41190=8 -DmainziTx_41931=8 -DmainziTy_41045=8 -DmainziTy_41191=8 -DmainziTy_41932=8 -Dmainzigroup_sizze_44243=32 -Dmainzisegmap_group_sizze_37619=32 -Dmainzisegmap_group_sizze_37797=32 -Dmainzisegmap_group_sizze_37925=32 -Dmainzisegmap_group_sizze_37957=32 -Dmainzisegmap_group_sizze_38004=32 -Dmainzisegmap_group_sizze_38489=32 -Dmainzisegmap_group_sizze_38654=32 -Dmainzisegmap_group_sizze_38708=32 -Dmainzisegmap_group_sizze_38775=32 -Dmainzisegmap_group_sizze_38869=32 -Dmainzisegmap_group_sizze_39049=32 -Dmainzisegmap_group_sizze_39190=32 -Dmainzisegmap_group_sizze_39322=32 -Dmainzisegmap_group_sizze_39603=32 -Dmainzisegmap_group_sizze_39678=32 -Dmainzisegmap_group_sizze_39827=32 -Dmainzisegmap_group_sizze_39929=32 -Dmainzisegmap_group_sizze_40076=32 -Dmainzisegmap_group_sizze_40200=32 -Dmainzisegmap_group_sizze_40570=32 -Dmainzisegmap_group_sizze_40712=32 -Dmainzisegmap_num_groups_37959=8 -Dmainzisegmap_num_groups_38006=8 -Dmainzisegmap_num_groups_39051=8 -Dmainzisegmap_num_groups_39192=8 -Dmainzisegmap_num_groups_39324=8 -Dmainzisegmap_num_groups_40714=8 -Dmainzisegred_group_sizze_38064=32 -Dmainzisegred_group_sizze_39111=32 -Dmainzisegred_group_sizze_39248=32 -Dmainzisegred_group_sizze_39378=32 -Dmainzisegred_group_sizze_39944=32 -Dmainzisegred_group_sizze_39965=32 -Dmainzisegred_group_sizze_40032=32 -Dmainzisegred_group_sizze_40116=32 -Dmainzisegred_group_sizze_40617=32 -Dmainzisegred_num_groups_38066=8 -Dmainzisegred_num_groups_39113=8 -Dmainzisegred_num_groups_39250=8 -Dmainzisegred_num_groups_39380=8 -Dmainzisegred_num_groups_39946=8 -Dmainzisegred_num_groups_39967=8 -Dmainzisegred_num_groups_40034=8 -Dmainzisegred_num_groups_40118=8 -Dmainzisegred_num_groups_40619=8 
-Dmainzisegscan_group_sizze_39687=32 -Dmainzisegscan_group_sizze_40671=32 -Dmainzisegscan_num_groups_39689=8 -Dmainzisegscan_num_groups_40673=8 -Dmainzisuff_intra_par_11=128 -Dmainzisuff_intra_par_13=128 -Dmainzisuff_intra_par_24=235 -Dmainzisuff_intra_par_29=113 -Dmainzisuff_intra_par_34=122 -Dmainzisuff_outer_par_16=2000000000 -Dmainzisuff_outer_par_17=892448 -Dmainzisuff_outer_par_18=2000000000 -Dmainzisuff_outer_par_19=892448 -Dmainzisuff_outer_par_20=2000000000 -Dmainzisuff_outer_par_21=26215660 -Dmainzisuff_outer_par_28=2000000000 -Dmainzisuff_outer_par_31=111556 -Dmainzisuff_outer_par_6=2000000000 -Dmainzisuff_outer_par_7=2000000000 -Dmainzisuff_outer_par_8=7139584 -Dmainzitile_sizze_41656=8 -DmainDetailedziRx_41192=3 -DmainDetailedziRx_41933=3 -DmainDetailedziRy_41193=3 -DmainDetailedziRy_41934=3 -DmainDetailedziTk_41189=8 -DmainDetailedziTk_41930=8 -DmainDetailedziTx_41044=8 -DmainDetailedziTx_41190=8 -DmainDetailedziTx_41931=8 -DmainDetailedziTy_41045=8 -DmainDetailedziTy_41191=8 -DmainDetailedziTy_41932=8 -DmainDetailedzigroup_sizze_44256=32 -DmainDetailedzisegmap_group_sizze_29648=32 -DmainDetailedzisegmap_group_sizze_29826=32 -DmainDetailedzisegmap_group_sizze_29954=32 -DmainDetailedzisegmap_group_sizze_29986=32 -DmainDetailedzisegmap_group_sizze_30033=32 -DmainDetailedzisegmap_group_sizze_30518=32 -DmainDetailedzisegmap_group_sizze_30683=32 -DmainDetailedzisegmap_group_sizze_30737=32 -DmainDetailedzisegmap_group_sizze_30804=32 -DmainDetailedzisegmap_group_sizze_30898=32 -DmainDetailedzisegmap_group_sizze_31078=32 -DmainDetailedzisegmap_group_sizze_31219=32 -DmainDetailedzisegmap_group_sizze_31351=32 -DmainDetailedzisegmap_group_sizze_31632=32 -DmainDetailedzisegmap_group_sizze_31707=32 -DmainDetailedzisegmap_group_sizze_31856=32 -DmainDetailedzisegmap_group_sizze_31958=32 -DmainDetailedzisegmap_group_sizze_32105=32 -DmainDetailedzisegmap_group_sizze_32229=32 -DmainDetailedzisegmap_group_sizze_32480=32 -DmainDetailedzisegmap_group_sizze_32602=32 
-DmainDetailedzisegmap_group_sizze_32659=32 -DmainDetailedzisegmap_group_sizze_33204=32 -DmainDetailedzisegmap_group_sizze_33256=32 -DmainDetailedzisegmap_group_sizze_33291=32 -DmainDetailedzisegmap_group_sizze_33412=32 -DmainDetailedzisegmap_num_groups_29988=8 -DmainDetailedzisegmap_num_groups_30035=8 -DmainDetailedzisegmap_num_groups_31080=8 -DmainDetailedzisegmap_num_groups_31221=8 -DmainDetailedzisegmap_num_groups_31353=8 -DmainDetailedzisegmap_num_groups_33414=8 -DmainDetailedzisegred_group_sizze_30093=32 -DmainDetailedzisegred_group_sizze_31140=32 -DmainDetailedzisegred_group_sizze_31277=32 -DmainDetailedzisegred_group_sizze_31407=32 -DmainDetailedzisegred_group_sizze_31973=32 -DmainDetailedzisegred_group_sizze_31994=32 -DmainDetailedzisegred_group_sizze_32061=32 -DmainDetailedzisegred_group_sizze_32145=32 -DmainDetailedzisegred_group_sizze_33317=32 -DmainDetailedzisegred_num_groups_30095=8 -DmainDetailedzisegred_num_groups_31142=8 -DmainDetailedzisegred_num_groups_31279=8 -DmainDetailedzisegred_num_groups_31409=8 -DmainDetailedzisegred_num_groups_31975=8 -DmainDetailedzisegred_num_groups_31996=8 -DmainDetailedzisegred_num_groups_32063=8 -DmainDetailedzisegred_num_groups_32147=8 -DmainDetailedzisegred_num_groups_33319=8 -DmainDetailedzisegscan_group_sizze_31716=32 -DmainDetailedzisegscan_group_sizze_33371=32 -DmainDetailedzisegscan_num_groups_31718=8 -DmainDetailedzisegscan_num_groups_33373=8 -DmainDetailedzisuff_intra_par_11=32 -DmainDetailedzisuff_intra_par_13=32 -DmainDetailedzisuff_intra_par_24=32 -DmainDetailedzisuff_intra_par_29=32 -DmainDetailedzisuff_intra_par_37=32 -DmainDetailedzisuff_outer_par_16=8 -DmainDetailedzisuff_outer_par_17=8 -DmainDetailedzisuff_outer_par_18=8 -DmainDetailedzisuff_outer_par_19=8 -DmainDetailedzisuff_outer_par_20=8 -DmainDetailedzisuff_outer_par_21=8 -DmainDetailedzisuff_outer_par_28=8 -DmainDetailedzisuff_outer_par_31=8 -DmainDetailedzisuff_outer_par_6=8 -DmainDetailedzisuff_outer_par_7=8 -DmainDetailedzisuff_outer_par_8=8 
-DmainDetailedzitile_sizze_41656=8 -DmainMagnitudeziRx_41192=3 -DmainMagnitudeziRx_41933=3 -DmainMagnitudeziRy_41193=3 -DmainMagnitudeziRy_41934=3 -DmainMagnitudeziTk_41189=8 -DmainMagnitudeziTk_41930=8 -DmainMagnitudeziTx_41044=8 -DmainMagnitudeziTx_41190=8 -DmainMagnitudeziTx_41931=8 -DmainMagnitudeziTy_41045=8 -DmainMagnitudeziTy_41191=8 -DmainMagnitudeziTy_41932=8 -DmainMagnitudezigroup_sizze_44244=32 -DmainMagnitudezisegmap_group_sizze_33713=32 -DmainMagnitudezisegmap_group_sizze_33891=32 -DmainMagnitudezisegmap_group_sizze_34019=32 -DmainMagnitudezisegmap_group_sizze_34051=32 -DmainMagnitudezisegmap_group_sizze_34098=32 -DmainMagnitudezisegmap_group_sizze_34583=32 -DmainMagnitudezisegmap_group_sizze_34748=32 -DmainMagnitudezisegmap_group_sizze_34802=32 -DmainMagnitudezisegmap_group_sizze_34869=32 -DmainMagnitudezisegmap_group_sizze_34963=32 -DmainMagnitudezisegmap_group_sizze_35143=32 -DmainMagnitudezisegmap_group_sizze_35284=32 -DmainMagnitudezisegmap_group_sizze_35416=32 -DmainMagnitudezisegmap_group_sizze_35697=32 -DmainMagnitudezisegmap_group_sizze_35772=32 -DmainMagnitudezisegmap_group_sizze_35921=32 -DmainMagnitudezisegmap_group_sizze_36023=32 -DmainMagnitudezisegmap_group_sizze_36170=32 -DmainMagnitudezisegmap_group_sizze_36294=32 -DmainMagnitudezisegmap_group_sizze_36545=32 -DmainMagnitudezisegmap_group_sizze_36667=32 -DmainMagnitudezisegmap_group_sizze_36724=32 -DmainMagnitudezisegmap_group_sizze_37222=32 -DmainMagnitudezisegmap_group_sizze_37364=32 -DmainMagnitudezisegmap_num_groups_34053=8 -DmainMagnitudezisegmap_num_groups_34100=8 -DmainMagnitudezisegmap_num_groups_35145=8 -DmainMagnitudezisegmap_num_groups_35286=8 -DmainMagnitudezisegmap_num_groups_35418=8 -DmainMagnitudezisegmap_num_groups_37366=8 -DmainMagnitudezisegred_group_sizze_34158=32 -DmainMagnitudezisegred_group_sizze_35205=32 -DmainMagnitudezisegred_group_sizze_35342=32 -DmainMagnitudezisegred_group_sizze_35472=32 -DmainMagnitudezisegred_group_sizze_36038=32 
-DmainMagnitudezisegred_group_sizze_36059=32 -DmainMagnitudezisegred_group_sizze_36126=32 -DmainMagnitudezisegred_group_sizze_36210=32 -DmainMagnitudezisegred_group_sizze_37269=32 -DmainMagnitudezisegred_num_groups_34160=8 -DmainMagnitudezisegred_num_groups_35207=8 -DmainMagnitudezisegred_num_groups_35344=8 -DmainMagnitudezisegred_num_groups_35474=8 -DmainMagnitudezisegred_num_groups_36040=8 -DmainMagnitudezisegred_num_groups_36061=8 -DmainMagnitudezisegred_num_groups_36128=8 -DmainMagnitudezisegred_num_groups_36212=8 -DmainMagnitudezisegred_num_groups_37271=8 -DmainMagnitudezisegscan_group_sizze_35781=32 -DmainMagnitudezisegscan_group_sizze_37323=32 -DmainMagnitudezisegscan_num_groups_35783=8 -DmainMagnitudezisegscan_num_groups_37325=8 -DmainMagnitudezisuff_intra_par_11=128 -DmainMagnitudezisuff_intra_par_13=2000000000 -DmainMagnitudezisuff_intra_par_24=235 -DmainMagnitudezisuff_intra_par_29=113 -DmainMagnitudezisuff_intra_par_37=122 -DmainMagnitudezisuff_outer_par_16=2000000000 -DmainMagnitudezisuff_outer_par_17=892448 -DmainMagnitudezisuff_outer_par_18=111556 -DmainMagnitudezisuff_outer_par_19=2000000000 -DmainMagnitudezisuff_outer_par_20=2000000000 -DmainMagnitudezisuff_outer_par_21=26215660 -DmainMagnitudezisuff_outer_par_28=2000000000 -DmainMagnitudezisuff_outer_par_31=111556 -DmainMagnitudezisuff_outer_par_6=2000000000 -DmainMagnitudezisuff_outer_par_7=2000000000 -DmainMagnitudezisuff_outer_par_8=7139584 -DmainMagnitudezitile_sizze_41656=8 -Dremove_nanszisegmap_group_sizze_29490=32 -I /Users/rave/opt/miniconda3/envs/pybayts/lib/python3.7/site-packages/pyopencl/cl)
(source saved as /var/folders/t5/9x0zlgbs2dz25fcjxvh9fpym0000gn/T/tmpqmd1y_4f.cl)

I then saw https://github.com/conda-forge/pyopencl-feedstock/issues/26#issuecomment-423554880, so I installed pocl in my conda environment with:

conda install -c conda-forge osx-pocl-opencl pocl pyopencl==2018.2.5

But when I ran the example again, the kernel died (I'm running it in a Jupyter notebook).

Any tips on how to set up the environment on a CPU-only machine?
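A quick way to see which OpenCL platforms and devices PyOpenCL can find is to enumerate them. The traceback above shows the build running on the 'Apple' platform, and it shows both a platform_id and a device_id being handed to the OpenCL backend, so knowing the index of each platform may help when pocl is installed next to Apple's driver. This is only a diagnostic sketch: list_opencl_devices is a hypothetical helper name, and it does not by itself fix the build failure.

```python
def list_opencl_devices():
    """Return (platform_id, device_id, platform, device, type) tuples.

    Returns an empty list if PyOpenCL is missing or no platform is found.
    """
    try:
        import pyopencl as cl
    except ImportError:
        return []

    found = []
    try:
        for p_id, platform in enumerate(cl.get_platforms()):
            for d_id, device in enumerate(platform.get_devices()):
                found.append((
                    p_id,
                    d_id,
                    platform.name,
                    device.name,
                    # Human-readable device type, e.g. CPU or GPU
                    cl.device_type.to_string(device.type),
                ))
    except cl.Error:
        # Raised, for example, when no OpenCL driver (ICD) is installed
        return found
    return found


if __name__ == "__main__":
    for entry in list_opencl_devices():
        print(entry)
```

Each printed tuple starts with the platform and device indices, followed by the platform name, the device name, and the device type.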

mortvest commented 3 years ago

Our implementation (the opencl backend) was tested on and targeted at dedicated GPUs. I have very limited knowledge of macOS, but from what I have heard, the OpenCL version supported on macOS is very old, so it is probably not going to work. If you want to use our implementation on a CPU-only machine, I would suggest using the "python-mp" backend instead. It is surprisingly fast on small-ish datasets.
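As a rough sketch, switching the example from the original post to this backend should only require changing the backend argument. The helper names cpu_monitor_kwargs and fit_on_cpu are hypothetical; the remaining parameter values are copied from the original example, and device_id is left out on the assumption that it only matters for OpenCL.

```python
from datetime import datetime


def cpu_monitor_kwargs():
    """Settings from the original example, with only the backend changed."""
    return dict(
        freq=365,
        k=3,
        hfrac=0.25,
        trend=False,
        level=0.05,
        backend="python-mp",  # CPU backend suggested in this thread
    )


def fit_on_cpu(data, dates, start_monitor=datetime(2010, 1, 1)):
    """Fit BFASTMonitor on the CPU and return the detected breaks."""
    # Imported lazily so the sketch can be loaded without bfast installed.
    from bfast import BFASTMonitor

    model = BFASTMonitor(start_monitor, **cpu_monitor_kwargs())
    # Same fit() call as in the original example.
    model.fit(data, dates, n_chunks=5, nan_value=-32768)
    return model.breaks
```

With data and dates prepared by crop_data_dates as in the original post, fit_on_cpu(data, dates) would return the same kind of breaks array that the example prints.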

rbavery commented 3 years ago

Thanks a bunch @mortvest, the python-mp backend works well.