r-barnes / faster-unmixer

A faster implementation of the sediment inverse/unmixing scheme proposed in Lipp et al (2021).

Fix get_upstream_areas for networks with more than 255 subbasins #56

Closed AlexLipp closed 4 months ago

AlexLipp commented 9 months ago

Problem

The get_upstream_areas method of NetworkUnmixer reads the labels.tif file from disk and uses it to assign an upstream area to each node in the network in Python. To do this it currently uses the matplotlib function imread, which expects every pixel value to be no greater than 255. As a result, if a network with more than 255 sample sites is loaded, the calculated sub-basin dictionary erroneously assigns no upstream area to any sample site with an ID greater than 255. For example, the following snippet, when run with this dataset: data.zip...

import funmixer
import numpy as np

sample_network, _ = funmixer.get_sample_graphs(
    flowdirs_filename="data/areas_test.asc",
    sample_data_filename="data/areas_test.data",
)

areas = funmixer.get_unique_upstream_areas(sample_network)

# Count the number of areas which are all empty
empty_samps = []
not_empty_samps = []
for samp, area in areas.items():
    if np.any(area):
        not_empty_samps.append(samp)
    else:
        empty_samps.append(samp)

print(f"Number of empty areas: {len(empty_samps)}")
print(f"Number of non-empty areas: {len(not_empty_samps)}")

returns:

Number of empty areas: 1519
Number of non-empty areas: 255

when it is expected to return:

Number of empty areas: 0
Number of non-empty areas: 1774
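The 255 / 1519 split is what you would expect if the label raster is squeezed into 8 bits on read. The sketch below reproduces those counts with NumPy alone. It does not call funmixer or matplotlib: the one-pixel-per-sub-basin raster is synthetic, the helper `count_non_empty` is hypothetical, and the saturating conversion to `uint8` is an assumption about how the reader behaves, not a confirmed description of it.

```python
import numpy as np

N_SUBBASINS = 1774  # number of sample sites in the test dataset above

# Synthetic label raster: one pixel per sub-basin, IDs 1..N_SUBBASINS.
labels = np.arange(1, N_SUBBASINS + 1, dtype=np.int32).reshape(1, -1)

# Assumption: the 8-bit reader saturates every value above 255 to 255.
labels_8bit = np.clip(labels, 0, 255).astype(np.uint8)


def count_non_empty(label_raster: np.ndarray, n_ids: int) -> int:
    """Count the IDs in 1..n_ids that own at least one pixel in the raster."""
    ids = np.arange(1, n_ids + 1)
    return int(np.isin(ids, np.unique(label_raster)).sum())


non_empty = count_non_empty(labels_8bit, N_SUBBASINS)
print(f"Number of empty areas: {N_SUBBASINS - non_empty}")
print(f"Number of non-empty areas: {non_empty}")
```

With the untruncated `labels` array, the same helper counts all 1774 IDs as non-empty.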

Fix

This is fixed by replacing plt.imread with imageio.v2.imread at Line 847. This requires the imageio package to be imported and declared in setup.py, but it is a pure-Python package so it should not present installation issues.
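A minimal sketch of what the change amounts to, not the actual code at Line 847. The helper names `read_labels` and `masks_from_labels` are hypothetical, and the demo array stands in for labels.tif. The only library call assumed is `imageio.v2.imread`, which should return the raster with its on-disk integer dtype.

```python
import numpy as np


def read_labels(path: str) -> np.ndarray:
    """Read a label raster; imageio should keep the file's integer dtype."""
    import imageio.v2 as imageio  # third-party; must be declared in setup.py

    return np.asarray(imageio.imread(path))


def masks_from_labels(labels: np.ndarray, ids) -> dict:
    """Map each sample ID to a boolean mask of the pixels carrying that label."""
    return {i: labels == i for i in ids}


# Synthetic stand-in for labels.tif, with IDs above 255 (0 is background).
demo = np.array([[1, 1, 300], [0, 300, 1774]], dtype=np.int32)
masks = masks_from_labels(demo, [1, 300, 1774])

# Every ID keeps a non-empty mask as long as the labels are not truncated.
print({i: int(m.sum()) for i, m in masks.items()})
```

In real use, `read_labels("labels.tif")` would supply the array in place of `demo`.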

r-barnes commented 4 months ago

Ideally, that list in setup.py is in alphabetical order.