WilhelmusLab / IceFloeTracker.jl

Julia package for ice floe tracker
https://wilhelmuslab.github.io/IceFloeTracker.jl/
MIT License

Landmasking before cloudmasking? #164

Open cpaniaguam opened 1 year ago

cpaniaguam commented 1 year ago
    I think we were discussing this at some point. Should the landmask be applied to the reflectance images first?

_Originally posted by @cpaniaguam in https://github.com/WilhelmusLab/IceFloeTracker.jl/pull/154#discussion_r1065959030_

cpaniaguam commented 1 year ago

Findings

Applying the landmask to the reflectance images before building the cloudmasks adds noticeable overhead. For this sample of six images, it is ~20% slower (8.6 s vs. 7.2 s) and allocates considerably more memory (21.2 GiB vs. 16.3 GiB). Below is the script I used.

using IceFloeTracker: load, create_cloudmask, float64, apply_landmask
using BenchmarkTools
matlab_landmask_file = raw"C:\Users\cpaniagu\Documents\IceFloeTracker.jl\test\test_inputs\matlab_landmask.png"

lm = BitMatrix(load(matlab_landmask_file))
dir = "images"
reflectance_imgs = [f for f in readdir(dir) if occursin(r"NE_", f)]
# 6-element Vector{String}:
#  "NE_Greenland.2020162.aqua.250m.tif"
#  "NE_Greenland.2020162.terra.250m.tif"
#  "NE_Greenland.2020163.aqua.250m.tif"
#  "NE_Greenland.2020163.terra.250m.tif"
#  "NE_Greenland.2020164.aqua.250m.tif"
#  "NE_Greenland.2020164.terra.250m.tif"

# warm up
img_ = float64.(load(joinpath(dir, reflectance_imgs[1])))
@time create_cloudmask(img_);
# 0.999295 seconds (60 allocations: 2.722 GiB)
@time apply_landmask(img_, lm);  # use img_ here: imgs is not defined until the "Load images" step below
# 0.174748 seconds (2 allocations: 828.528 MiB)

# Load images
imgs = [float64.(load(joinpath(dir, i))) for i in reflectance_imgs]

# No landmasking
@time create_cloudmask.(imgs);
# 7.224003 seconds (358 allocations: 16.334 GiB, 13.31% gc time)

# Landmasking
@time create_cloudmask.([apply_landmask(i, lm) for i in imgs]);
# 8.611660 seconds (43.38 k allocations: 21.191 GiB, 15.77% gc time, 0.24% compilation time)
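
As a rough check on where the overhead comes from, the figures reported by `@time` above can be compared directly. This is only arithmetic on the numbers already shown; the variable names are made up for illustration, and it assumes the warm-up allocation for `apply_landmask` is representative of all six images.

```julia
# Figures copied from the @time output above.
t_plain, t_masked = 7.224003, 8.611660      # seconds, six images
mem_plain, mem_masked = 16.334, 21.191      # GiB allocated, six images
landmask_alloc_mib = 828.528                # MiB for one apply_landmask call (warm-up)
n_images = 6

# Relative slowdown of the landmask-first run.
slowdown = (t_masked - t_plain) / t_plain

# Extra allocation of the landmask-first run vs. what six
# apply_landmask calls would allocate on their own.
extra_gib = mem_masked - mem_plain
landmask_gib = n_images * landmask_alloc_mib / 1024

println("slowdown:            ", round(100 * slowdown; digits=1), " %")
println("extra allocation:    ", round(extra_gib; digits=3), " GiB")
println("6 x apply_landmask:  ", round(landmask_gib; digits=3), " GiB")
```

The last two values come out nearly equal, which suggests the extra memory is mostly the masked copies produced by `apply_landmask` rather than additional work inside `create_cloudmask`.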

tdivoll commented 1 year ago

I agree, it adds overhead. The next step would be to create the cloudmask only for the part of the image that is not landmasked. If create_cloudmask can skip the ~15% of the image that is already landmasked, it might save resources in the end. As it stands, I think it still checks the landmasked area and processes every pixel. I'm not sure how to mask out the landmasked area in Julia, but I can keep looking into it.
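
One possible way to skip landmasked pixels in Julia is to loop over the image and the mask together and only evaluate the per-pixel test where the mask is not set. The sketch below is not how `create_cloudmask` works: `is_cloud` and `cloudmask_skip_land` are hypothetical names, the threshold is a placeholder for the real cloud criteria, and it assumes `true` in the landmask means land.

```julia
# Hypothetical per-pixel test standing in for the real cloud criteria.
is_cloud(px) = px > 0.5

# Evaluate the test only on pixels that are not land.
# Assumed convention: landmask[i] == true means pixel i is land.
function cloudmask_skip_land(img::AbstractMatrix{<:Real}, landmask::BitMatrix)
    size(img) == size(landmask) ||
        throw(DimensionMismatch("image and landmask sizes differ"))
    out = falses(size(img))          # land pixels stay false
    for i in eachindex(img, landmask)
        landmask[i] && continue      # skip land, no work done here
        out[i] = is_cloud(img[i])
    end
    return out
end

# Tiny example: the top-left pixel is land and is never tested.
img = [0.9 0.2; 0.7 0.8]
landmask = falses(2, 2)
landmask[1, 1] = true
mask = cloudmask_skip_land(img, landmask)
```

Whether this saves anything in practice would depend on how expensive the real per-pixel criteria are compared with the branch on the mask, so it would need benchmarking against the current approach.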

cpaniaguam commented 1 year ago

We can revisit the matter in the future. For the moment I think I'll move on to the other steps in the pipeline.