GliderToolsCommunity / GliderTools

A toolkit for processing Seaglider base station NetCDF files: despiking, smoothing, outlier detection, backscatter, fluorescence quenching, calibration, gridding, interpolation.
https://glidertools.readthedocs.io
GNU Affero General Public License v3.0
69 stars 28 forks source link

Error in calc_fluorescence: numpy boolean subtract #189

Closed smwoodman closed 1 year ago

smwoodman commented 1 year ago

Describe the bug When running calc_fluorescence with return_figure = True, optics.quenching_report is called.

quenching_report has the line of code smin, smax = nanpercentile(z[2], [2, 98]), where z[2] is a boolean mask.

If using numpy>=1.20, then this line will cause the following error: "TypeError: numpy boolean subtract, the - operator, is not supported, use the bitwise_xor, the ^ operator, or the logical_xor function instead."

To Reproduce Specific error using numpy:

import numpy as np
np.percentile([True, False, False], q=0.5)

To reproduce error using GliderTools (in 'ci/environment.yml' conda env): Code:

import glidertools as gt

filenames = 'C:/SMW/Gliders_Moorings/Gliders/GliderTools/tests/data/p5420*.nc'
names = [
    'ctd_depth',
    'ctd_time',
    'ctd_pressure',
    'salinity',
    'temperature',
    'eng_wlbb2flvmt_Chlsig',
    'eng_wlbb2flvmt_wl470sig',
    'eng_wlbb2flvmt_wl700sig',
    'aanderaa4330_dissolved_oxygen',
    'eng_qsp_PARuV',
]
ds_dict = gt.load.seaglider_basestation_netCDFs(
    filenames, names, return_merged=True, keep_global_attrs=False)

merged = ds_dict['merged']
if 'time' in merged:
    merged = merged.drop(["time", "time_dt64"])

dat = merged.rename({
    'salinity': 'salt_raw',
    'temperature': 'temp_raw',
    'ctd_pressure': 'pressure',
    'ctd_depth': 'depth',
    'ctd_time_dt64': 'time',
    'ctd_time': 'time_raw',
    'eng_wlbb2flvmt_wl700sig': 'bb700_raw',
    'eng_wlbb2flvmt_wl470sig': 'bb470_raw',
    'eng_wlbb2flvmt_Chlsig': 'flr_raw',
    'eng_qsp_PARuV': 'par_raw',
    'aanderaa4330_dissolved_oxygen': 'oxy_raw',
})

# variable assignment for conveniant access
depth = dat.depth
dives = dat.dives
lats = dat.latitude
lons = dat.longitude
time = dat.time
pres = dat.pressure
temp = dat.temp_raw
salt = dat.salt_raw
par = dat.par_raw
bb700 = dat.bb700_raw
bb470 = dat.bb470_raw
fluor = dat.flr_raw

bbp_baseline, bbp_spikes = gt.processing.calc_backscatter(
    bb700, temp, salt, dives, depth, 700, 49, 3.217e-5, 
    spike_window=11, spike_method='minmax', iqr=2., profiles_ref_depth=300,
    deep_multiplier=1, deep_method='median', verbose=True)
dat['bbp700'] = bbp_baseline

flr_qnch, flr, qnch_layer, [fig1, fig2] = gt.processing.calc_fluorescence(
    fluor, dat.bbp700, dives, depth, time, lats, lons, 53, 0.0121,
    profiles_ref_depth=300, deep_method='mean', deep_multiplier=1,
    spike_window=11, spike_method='median', return_figure=True, 
    night_day_group=False, sunrise_sunset_offset=2, verbose=True)

Script output:

(test_env_glidertools) C:\SMW\Gliders_Moorings\Gliders\GliderTools>C:/Users/sam.woodman/AppData/Local/anaconda3/envs/test_env_glidertools/python.exe c:/SMW/Gliders_Moorings/Gliders/GliderTools/notebooks/numpy_boolean_subtract-error.py

DIMENSION: sg_data_point
{latitude, temperature, eng_wlbb2flvmt_Chlsig, ctd_pressure, ctd_time, ctd_depth, aanderaa4330_dissolved_oxygen, salinity, longitude, eng_wlbb2flvmt_wl470sig, eng_wlbb2flvmt_wl700sig}     
100%|███████████████████████████████████████████████████████| 14/14 [00:00<00:00, 148.70it/s]

DIMENSION: qsp2150_data_point
{ctd_time, eng_qsp_PARuV}
100%|███████████████████████████████████████████████████████| 14/14 [00:00<00:00, 227.98it/s]

Merging dimensions on time indicies: sg_data_point, qsp2150_data_point, 
==================================================
bb700:
        Removing outliers with IQR * 2.0: 12 obs
        Mask bad profiles based on deep values (depth=300m)
        Number of bad profiles = 5/28
        Zhang et al. (2009) correction
        Dark count correction
        Spike identification (spike window=11)

==================================================
Fluorescence
        Mask bad profiles based on deep values (ref depth=300m)
        Number of bad profiles = 9/28
        Dark count correction
        Quenching correction
        Spike identification (spike window=11)
        Generating figures for despiking and quenching report
Traceback (most recent call last):
  File "c:\SMW\Gliders_Moorings\Gliders\GliderTools\notebooks\numpy_boolean_subtract-error.py", line 59, in <module>
    flr_qnch, flr, qnch_layer, [fig1, fig2] = gt.processing.calc_fluorescence(
                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\SMW\Gliders_Moorings\Gliders\GliderTools\glidertools\processing.py", line 598, in calc_fluorescence
    op.quenching_report(
  File "C:\SMW\Gliders_Moorings\Gliders\GliderTools\glidertools\optics.py", line 819, in quenching_report
    smin, smax = nanpercentile(z[2], [2, 98])
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sam.woodman\AppData\Local\anaconda3\envs\test_env_glidertools\Lib\site-packages\numpy\lib\nanfunctions.py", line 1384, in nanpercentile
    return _nanquantile_unchecked(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sam.woodman\AppData\Local\anaconda3\envs\test_env_glidertools\Lib\site-packages\numpy\lib\nanfunctions.py", line 1563, in _nanquantile_unchecked
    return function_base._ureduce(a,
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sam.woodman\AppData\Local\anaconda3\envs\test_env_glidertools\Lib\site-packages\numpy\lib\function_base.py", line 3823, in _ureduce
    r = func(a, **kwargs)
        ^^^^^^^^^^^^^^^^^
  File "C:\Users\sam.woodman\AppData\Local\anaconda3\envs\test_env_glidertools\Lib\site-packages\numpy\lib\nanfunctions.py", line 1582, in _nanquantile_ureduce_func
    result = _nanquantile_1d(part, q, overwrite_input, method)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sam.woodman\AppData\Local\anaconda3\envs\test_env_glidertools\Lib\site-packages\numpy\lib\nanfunctions.py", line 1608, in _nanquantile_1d
    return function_base._quantile_unchecked(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sam.woodman\AppData\Local\anaconda3\envs\test_env_glidertools\Lib\site-packages\numpy\lib\function_base.py", line 4555, in _quantile_unchecked
    return _ureduce(a,
           ^^^^^^^^^^^
  File "C:\Users\sam.woodman\AppData\Local\anaconda3\envs\test_env_glidertools\Lib\site-packages\numpy\lib\function_base.py", line 3823, in _ureduce
    r = func(a, **kwargs)
        ^^^^^^^^^^^^^^^^^
  File "C:\Users\sam.woodman\AppData\Local\anaconda3\envs\test_env_glidertools\Lib\site-packages\numpy\lib\function_base.py", line 4721, in _quantile_ureduce_func
    result = _quantile(arr,
             ^^^^^^^^^^^^^^
  File "C:\Users\sam.woodman\AppData\Local\anaconda3\envs\test_env_glidertools\Lib\site-packages\numpy\lib\function_base.py", line 4840, in _quantile
    result = _lerp(previous,
             ^^^^^^^^^^^^^^^
  File "C:\Users\sam.woodman\AppData\Local\anaconda3\envs\test_env_glidertools\Lib\site-packages\numpy\lib\function_base.py", line 4655, in _lerp
    diff_b_a = subtract(b, a)
               ^^^^^^^^^^^^^^
TypeError: numpy boolean subtract, the `-` operator, is not supported, use the bitwise_xor, the `^` operator, or the logical_xor function instead.

Expected behavior Updated code, so there is no error when using recent versions of numpy.

Since z[2] in quenching_report appears to (always?) be a boolean array, can smin and smax be 0 and 1, respectively? Or, to provide the most flexibility should there be no hard-coded min/max?

MartinMohrmann commented 1 year ago

Thank you so much for documenting this bug! I created a PR (#191) that will be merged soon. Your excellent example made it easy to address this bug. :)