Closed by bw4sz 1 year ago
The workflow is doing what it's supposed to. There's an issue with the data:
(base) [ethanwhite@login1 everglades]$ ls -lh orthomosaics/2022/Vacation/Vacation_05_26_2022*
-rw-r--r-- 1 b.weinstein ewhite 3.9G Jan 26 13:55 orthomosaics/2022/Vacation/Vacation_05_26_2022_A.tif
-rw-r--r-- 1 b.weinstein ewhite 3.9G Jan 26 13:58 orthomosaics/2022/Vacation/Vacation_05_26_2022_B.tif
-rw-rw-r-- 1 b.weinstein ewhite 2.9G Jun 2 2022 orthomosaics/2022/Vacation/Vacation_05_26_2022.tif
There are actually two flights here, but three files. *2022_A.tif and *2022.tif are both getting processed as primary, so the birds are getting doubled.
Confirmed that Vacation_05_26_2022 and Vacation_05_26_2022_A are the same flight. I've deleted Vacation_05_26_2022.tif from Dropbox. Sadly I can't delete it from the HPG and the rule because I can't log in...
Now that the HPG has fixed its auth system, I've deleted the extra file and am currently rerunning the workflow. Thanks for catching that @bw4sz! We should add a check for similar situations in the future. I'll open an issue.
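A rough sketch of what such a check could look like, not the project's actual implementation. It assumes orthomosaic names follow the `<Site>_<MM>_<DD>_<YYYY>[_<letter>].tif` pattern seen in the listing above, and flags any flight date that has both an unsuffixed file and lettered files. The function name `find_ambiguous_flights` and the regex are hypothetical.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed naming convention, inferred from the directory listing:
# <Site>_<MM>_<DD>_<YYYY>[_<letter>].tif
FLIGHT_PATTERN = re.compile(
    r"^(?P<flight>.+_\d{2}_\d{2}_\d{4})(?:_(?P<suffix>[A-Z]))?$"
)


def find_ambiguous_flights(paths):
    """Return {flight: [file names]} for flights that have both an
    unsuffixed file and at least one lettered file (_A, _B, ...)."""
    groups = defaultdict(list)
    for path in paths:
        match = FLIGHT_PATTERN.match(Path(path).stem)
        if match:
            # suffix is None for the unsuffixed file
            groups[match.group("flight")].append(
                (match.group("suffix"), Path(path).name)
            )

    ambiguous = {}
    for flight, entries in groups.items():
        suffixes = {suffix for suffix, _ in entries}
        if None in suffixes and len(suffixes) > 1:
            ambiguous[flight] = sorted(name for _, name in entries)
    return ambiguous


if __name__ == "__main__":
    files = [
        "orthomosaics/2022/Vacation/Vacation_05_26_2022_A.tif",
        "orthomosaics/2022/Vacation/Vacation_05_26_2022_B.tif",
        "orthomosaics/2022/Vacation/Vacation_05_26_2022.tif",
    ]
    for flight, names in find_ambiguous_flights(files).items():
        print(f"{flight}: {', '.join(names)}")
```

Run against the three files above, this would flag Vacation_05_26_2022, since the unsuffixed file coexists with the _A and _B files. A flight with only _A and _B files would pass.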
Should be all fixed now
There are spatially duplicate predictions with unique IDs in Vacation 05/26/2022.
I downloaded PredictedBirds.zip from
/blue/ewhite/everglades/EvergladesTools/App/Zooniverse/data
unzipped it, and overlaid the image on the drone data. There is one bird, but two records. To verify this is a pipeline problem and not a machine learning problem, I downloaded
and there are no duplicates.
This is likely a merge issue somewhere when predicted_birds.zip is created?
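For anyone wanting to check for this without overlaying the data by hand, here is a minimal sketch that finds records with different IDs at nearly the same location. It assumes each prediction has been reduced to a point (for example a box centroid) and stored as a dict with `id`, `x`, and `y` keys. Those field names, the function name, and the default tolerance are all hypothetical, and the tolerance is in whatever units the coordinates use.

```python
from itertools import combinations
from math import dist


def find_spatial_duplicates(records, tolerance=0.5):
    """Return (id_a, id_b) pairs for records with different IDs whose
    points lie within `tolerance` of each other.

    Compares every pair, so it is quadratic in the number of records;
    fine for a spot check on one site, slow for a full season.
    """
    pairs = []
    for a, b in combinations(records, 2):
        if a["id"] == b["id"]:
            continue
        if dist((a["x"], a["y"]), (b["x"], b["y"])) <= tolerance:
            pairs.append((a["id"], b["id"]))
    return pairs


if __name__ == "__main__":
    # Made-up coordinates for illustration only.
    sample = [
        {"id": 1, "x": 100.0, "y": 200.0},
        {"id": 2, "x": 100.1, "y": 200.0},  # same bird, second record
        {"id": 3, "x": 150.0, "y": 260.0},
    ]
    print(find_spatial_duplicates(sample))
```

On the made-up sample, records 1 and 2 are reported as a pair and record 3 is left alone. For the full dataset, a spatial index or a spatial join in a GIS library would scale better than the pairwise loop.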