Closed emmaai closed 9 months ago
Hmm, a user recently emailed us about a similar-looking issue:
They're not doing geomedians, but are doing large-scale medians/means/stdev using Dask. The previous assumption was that this was caused by the S3 random data access problem we've talked about previously on Teams, but I wonder if it might actually be related to this issue too...
(or maybe it's completely different, hard to tell from those screenshots)
Can confirm none of the above fixed the issue; it happened again in the 2015 geomedian test processing: http://dea-public-data-dev.s3-website-ap-southeast-2.amazonaws.com/?prefix=test/gm-ls8-dilation-6-cloud-opening-5-v2/3-0-0/x40/y13/2015--P1Y/
Thanks for your work on this @emmaai! I know this was a tough one.
"Misplaced" chunks in geomedian as shown in the picture below. My current theory is that the graph unpack is triggered by `persist` before the graph build is complete, and then something goes wrong in the middle. It's either a bug in `dask`, or we shouldn't do this at all. It seems to happen randomly. I don't have a reliable way to reproduce it.
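For anyone trying to narrow this down, here is a rough sketch of the two patterns being contrasted: calling `persist` partway through building the graph versus keeping everything lazy and computing once at the end. It is only an illustration under assumptions. The small in-memory array and the names `data`, `scaled`, `result_a`, `result_b` are hypothetical stand-ins, not the real geomedian pipeline, and a plain `mean` is used in place of the geomedian. On a local scheduler both paths should agree, so this does not reproduce the bug; it only shows the kind of reference comparison that could flag misplaced chunks.

```python
import numpy as np
import dask.array as da

# Hypothetical stand-in for the real input: a small (time, y, x) stack
# split into spatial chunks so that chunk placement actually matters.
data = np.arange(4 * 6 * 6, dtype="float64").reshape(4, 6, 6)
x = da.from_array(data, chunks=(4, 3, 3))

# Pattern A (the one under suspicion): persist an intermediate result,
# then keep building on top of the persisted collection.
scaled = (x * 2.0).persist()
result_a = scaled.mean(axis=0).compute()

# Pattern B: build the whole graph lazily and trigger execution once.
result_b = (x * 2.0).mean(axis=0).compute()

# Reference computed with plain NumPy, independent of any chunking.
expected = (data * 2.0).mean(axis=0)

# If chunks ended up in the wrong place, these comparisons would fail.
chunks_match = bool(np.array_equal(result_a, result_b))
matches_reference = bool(np.allclose(result_a, expected))
print(chunks_match, matches_reference)
```

If the theory holds, a check like this would need to run against the real cluster and workload to be meaningful, since the problem seems to depend on timing rather than on the data.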