Closed pierotofy closed 1 year ago
Happy to test split-merge. Other than bombing out granularity for other flags, anything I should be looking for with split?
Nothing too specific, test the happy path cases (local split, remote split), try changing some parameters, see if the rerun selection behaves the way you would expect it to if you were to choose rerun-from by hand.
Sounds good. Local split is running. It's been blissfully long since I've had a project that needed split.
Sanity check on my command:
docker run -ti --rm -v /smallfast_00/kakuma:/smallfast_00/kakuma odm_rerun --project-path /smallfast_00/kakuma/ Kakuma_13B_Flight_01_2 --orthophoto-resolution 30 --split 400 --sm-cluster http://192.168.whateverman:3002
Edit:
Something's wrong:
[INFO] sm_cluster: True
[INFO] sm_no_align: False
....
[INFO] split: 400
[INFO] split_image_groups: None
[INFO] split_overlap: 150
This seems fine, but nothing is happening on the ClusterODM node (task list
returns nothing) and matching started on the machine this is run from as though it were a sequential split-merge. Am I missing something?
I don't know if @pierotofy already included this validation, but one should only compare the current flag settings with the previous ones for the same project, please filter by project.
Unhappy path in question if not filtering by project:
user runs project 1 runs with flags A, B, C - he's now satisfied with the results
user goes back to project 2 (he's unhappy with this one's last output) and changes flag D to re-run
autorerun
will consider flags ABC used on project 1 for comparison with project 2 flag D and make a poor decision about where to restart from
EDIT: Does this feature have persistence? Is it saved on database/disk? Assuming ODM has a project table/entity, one extra column for the last flags used on json format might do the trick, just an idea.
This seems fine, but nothing is happening on the ClusterODM node (
task list
returns nothing) and matching started on the machine this is run from as though it were a sequential split-merge. Am I missing something?
Still not sure what's happening here, need to rerun and monitor for uploading to cluster which doesn't appear to be happening, but now running with sm-cluster set through webodm, and that's working fine.
Still not sure what's happening here
Not sure either, this PR has not modified the sm_cluster logic. Can you post the task output?
Will do after tests are complete.
Retraction, the above command is working fine. Should be done with testing this morning for distributed split-merge.
Just to verify: I don't think I can test yet in WebODM, correct? There's no way to re-run without setting a stage for rerun.
I don't think I can test yet in WebODM, correct? There's no way to re-run without setting a stage for rerun.
Correct, WebODM will require some modification.
Ah, the brittlness of split-merge. Still waiting for a successful completion. Should have chosen a smaller dataset. Hopefully something today.
Still not sure what's happening here
Not sure either, this PR has not modified the sm_cluster logic. Can you post the task output?
Circling back, this was a dataset that required a geo.txt but I forgot to include that, so split couldn't happen based on image tags.
Now running a different (smaller) dataset anyway.
First command:
docker run -ti --rm -v /smallfast_00/kakuma:/smallfast_00/kakuma odm_rerun --project-path /smallfast_00/kakuma/ Kakuma_13B_Flight_03 --split 400 --sm-cluster http://192.168.20.24:3002
Second command run in sequence (no rerun triggered, addition of orthophoto flag ignored):
docker run -ti --rm -v /smallfast_00/kakuma:/smallfast_00/kakuma odm_rerun --project-path /smallfast_00/kakuma/ Kakuma_13B_Flight_03 --split 400 --sm-cluster http://192.168.20.24:3002 --orthophoto-resolution 1
Third command run in sequence (rerun gets triggered):
docker run -ti --rm -v /smallfast_00/kakuma:/smallfast_00/kakuma odm_rerun --project-path /smallfast_00/kakuma/ Kakuma_13B_Flight_03 --split 200 --sm-cluster http://192.168.20.24:3002
Second command run in sequence (no rerun triggered, addition of orthophoto flag ignored):
Mm, typically one would need to rerun the entire pipeline --rerun-from split
here. But the fact that the rerun wasn't triggered is not bad either, it just matches the current ODM logic, which is good, nothing breaks.
This gets a bit tricky, I think most parameters in split-merge need to re-trigger the split
stage, but there could be exceptions, e.g. if you run split-merge with --dsm
and then without --dsm
, then technically you should just rerun the merge
stage, rerunning the entire pipeline would not be the expected behavior.
I'm starting to think that split-merge, as an advanced feature, might just need to continue having the manual --rerun-from
input from users.
A good question is: what are the cases when changing a parameter and using split-merge should not result in the entire pipeline to be re-run from the beginning? (Most parameters will require a rerun)
As currently implemented, split-merge is an expert only tool, so I don't necessarily mind if it stays as-is, although defaulting to rerunning for most parameter changes would be preferable. If I do the split-merge re-write, I'll need help thinking through how it interacts with this pull request, though I probably won't have bandwidth for the at rewrite until I get the org through the first end of year reporting.
I'll change it so that if any parameter is modified and --split
is set, the pipeline will rerun from split (unless explicitly overriden via --rerun-*
). Thanks for helping with testing.
Sounds good.
I was on the forum and I saw this post that mentioned this PR. Is this implemented in NodeODM?
Implements auto-rerun logic. :boom:
E.g.
./run.sh /datasets/brighton
[process completes]./run.sh /datasets/brighton --orthophoto-resolution 2
(change resolution)rerun-from: odm_orthophoto
If
rerun-from
,rerun-all
orrerun
is set, the flag is respected instead (does not automatically attempt to choose the rerun stage).Please help test this, especially with split-merge and distributed split-merge? :pray:
1718 #1719