kanishk-aidash opened this issue 3 years ago
s2p should run without problems for input images of arbitrary size, even on a system with small memory. First, try setting the option "max_processes" to 1 in the config file and see if you get the same error.
If this doesn't solve the problem, can you share the config file and the contents of the folder "/.../pair1/" mentioned in your error message?
Hi @mnhrdt
Ran with max_processes = 1, still same error.
My config file:
{
  "out_dir": "/Users/kanishkvarshney/Downloads/dsm_data_aidash/output_dir/08745eee-35c1-4bd6-8d93-becbb92e05ed/pair1",
  "images": [
    {
      "img": "/Users/kanishkvarshney/Downloads/dsm_data_aidash/pair1/img1.TIF",
      "rpc": "/Users/kanishkvarshney/Downloads/dsm_data_aidash/pair1/rpc1.XML"
    },
    {
      "img": "/Users/kanishkvarshney/Downloads/dsm_data_aidash/pair1/img2.TIF",
      "rpc": "/Users/kanishkvarshney/Downloads/dsm_data_aidash/pair1/rpc2.XML"
    }
  ],
  "full_img": true,
  "dsm_resolution": 0.5,
  "disp_range_method": "sift",
  "tile_size": 600,
  "horizontal_margin": 20,
  "vertical_margin": 5,
  "timeout": 7200,
  "clean_intermediate": true,
  "matching_algorithm": "mgm_multi",
  "mgm_timeout": 7200,
  "max_processes": 1
}
The ../pair1/ folder contains the 1-band stereo pair (GeoTIFFs) and the corresponding RPC XML files.
Sorry, I meant "pair_1", not "pair1". It's the temporary folder where the particular tiles that triggered the error reside. Its full name appears in the subprocess call to mgm_multi, something like ./output/tiles/row_XXX/col_XXX/pair_1.
This should be a folder with two small rectified tiles of size 600x600 that you can share by zipping the whole directory.
@mnhrdt I am uploading the zip (and the error) for a new run (I overrode the output directory between runs). The flow breaks on a random tile during stereo matching with the same error.
This one is from an 800x800 tile run. I have been trying different configurations for the run, but all of them break with the same error.
You have probably reached a memory limit... I have run this tile on my laptop and it takes almost 4 GB of memory at one point. Can you try running the following command inside the "pair_1" folder and see what happens:
/path/to/your/install/of/s2p/bin/mgm_multi -r -109 -R 122 -S 6 -s vfit -t census -O 8 -P1 8.0 -P2 32.0 -confidence_consensusL rectified_disp_confidence.tif rectified_ref.tif rectified_sec.tif rectified_disp.tif
If it fails, you can try closing all other applications on your computer (closing the browsers may suffice) and then it may work. That would mean that it is indeed an out-of-memory error that we can try to solve, or at least work around.
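To check how much memory the run actually needs, here is a small sketch (not part of s2p) that runs the command above from inside pair_1 and reports the peak memory of the child process; the mgm_multi path is a placeholder, as in the command above:

import resource
import subprocess

# Run the mgm_multi command shown above from inside the pair_1 folder and report
# the peak resident memory of the child process (ru_maxrss is in kB on Linux).
cmd = ("/path/to/your/install/of/s2p/bin/mgm_multi "
       "-r -109 -R 122 -S 6 -s vfit -t census -O 8 -P1 8.0 -P2 32.0 "
       "-confidence_consensusL rectified_disp_confidence.tif "
       "rectified_ref.tif rectified_sec.tif rectified_disp.tif")
ret = subprocess.run(cmd, shell=True, cwd="pair_1")
peak_kb = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
print(f"exit code: {ret.returncode}, peak child RSS: {peak_kb / 1e6:.1f} GB")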
Hey @mnhrdt The process was running on an 8 GB, 4-core EC2 Ubuntu 20.04 machine dedicated only to stereo matching. Nothing else is running on that system. As mentioned before, the tile on which the SIGABRT happens isn't consistent: the run can go OOM on any of the tiles during the stereo matching step of the algorithm.
As a workaround, I have started the algorithm on a new 32 GB system with max_processes = 1; the process has been running for over 12 hours now and only about half the tiles (~600 / 1056) have been stereo matched so far.
The disparity range is not being estimated because of a lack of SIFT matches. Could you try running the pipeline with the option cfg['sift_match_thresh'] = 0.8 ?
@kanishk-aidash to speed things up while keeping the memory usage as low as possible during the stereo matching step, you can remove the max_processes parameter from the input json file and replace it with these two parameters:
"max_processes_stereo_matching": 1,
"omp_num_threads": 8,
Hey @carlodef It doesn't work. With your settings it still goes out of memory on a 32 GB, 8-core box (this is the new box mentioned above).
Attaching the pair_1.zip for the mgm_multi command (https://github.com/cmla/s2p/files/6371361/pair_1.zip) and the config.json generated by s2p in the same folder (https://github.com/cmla/s2p/files/6371369/config_json.txt).
Memory usage (hitting 32 GB), screenshots:
https://user-images.githubusercontent.com/77284268/115982928-facae380-a5bb-11eb-9cdc-1f408ccc5719.png
https://user-images.githubusercontent.com/77284268/115982976-467d8d00-a5bc-11eb-9f9e-148606f689c2.png
The only successful run I have had so far is with 'max_processes' = 1, which takes around 20+ hours.
@gfacciol I have set cfg['sift_match_thresh'] = 0.8 as well, but it doesn't seem to help either.
Well, your process peaks at ~35 GB of RAM.
The reason is that SIFT failed to find matches, so the disparity range falls back to the maximum possible range, which here is [-700, 217]!
SUBPIX=2 mgm_multi -r -700 -R 217 -S 6 -s vfit -t census -O 8 -P1 8 -P2 32 -confidence_consensusL conf.tif rectified_ref.tif rectified_sec.tif disp.tif
The actual range for this tile is approximately [-200, 150]. The black image boundaries are not helping either, because they are processed at all scales trying to find a match.
To increase the probability of finding SIFT matches in this image (and reduce the range) you should set cfg['sift_match_thresh'] = 0.8.
This should limit the disparity range. We're working on a solution for the case when no SIFT matches are found, but it's not integrated yet.
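As a rough illustration of why the range matters, here is a back-of-the-envelope sketch; it assumes (as is typical for semi-global matchers) that memory grows roughly linearly with the number of disparity labels, and the exact constants in mgm_multi will differ:

# Relative cost of the fallback range vs. the approximate true range of this tile,
# assuming memory scales with the number of disparity labels tested.
def n_labels(disp_min, disp_max):
    return disp_max - disp_min + 1

fallback = n_labels(-700, 217)  # 918 labels, used when SIFT finds no matches
actual = n_labels(-200, 150)    # 351 labels, approximate true range here
print(f"{fallback / actual:.1f}x more labels with the fallback range")  # ~2.6x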
@kanishk-aidash could you please add "use_srtm": true to the input config json file? This should help.
Hey @gfacciol I understand the issue to some extent. High memory consumption is expected for MGM, but ~35 GB is still a bit too much. I have tried running this process with a single worker, etc.
I have already tried the setting you suggested, cfg['sift_match_thresh'] = 0.8, but to no avail. The last logs you are seeing are already with this threshold.
@carlodef Nope, "use_srtm": true doesn't work either.
The only thing that works so far is "max_processes" = 1, which takes around 20+ hours.
I agree that's a lot of memory, and we're working on a fix. Meanwhile I have a workaround for your case: it consists of changing the SUBPIX parameter of the correlator from 2 to 1, i.e. editing env['SUBPIX'] = '2' to env['SUBPIX'] = '1'.
Here is the line in question: https://github.com/cmla/s2p/blob/f7540c0723e1613992f0d3aaae01db1b208e6a03/s2p/block_matching.py#L261
This should keep the memory usage within the 32 GB.
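If you prefer not to edit the file by hand, here is a one-off sketch of the same change; it assumes the line still reads exactly as in the link above, and the path to the s2p checkout is a placeholder:

from pathlib import Path

# Replace env['SUBPIX'] = '2' with env['SUBPIX'] = '1' in s2p/block_matching.py
# (first occurrence only), then rerun s2p (reinstall first if s2p is not installed
# in editable mode).
path = Path("/path/to/your/install/of/s2p/s2p/block_matching.py")
src = path.read_text()
path.write_text(src.replace("env['SUBPIX'] = '2'", "env['SUBPIX'] = '1'", 1))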
Hey @gfacciol
This fix lets the code run without crashing. It takes around 16 hours for a 20000x20000 single-band stereo pair.
In the generated DSM, I am seeing lots of NoData though. The input GeoTIFF resolution is 0.5 m. ![Uploading Screenshot 2021-04-28 at 5.00.23 PM.png…]()
Hi @kanishk-aidash, your attachment didn't work. But some holes are expected anyway, given the density of the matching (which depends on the angle between the views). As a reference, this is the reconstruction of the tile you sent the other day.
@gfacciol Updating the attachment: https://user-images.githubusercontent.com/77284268/116418449-c70ce980-a859-11eb-852a-7843a15282c2.png
I expect some missing data, but here the final output sort of looks like salt-and-pepper noise.
Ouch, that looks bad, but at this scale it is hard to tell, because NaNs dilate after most subsampling operations. Can you zoom in on some area at the scale of the resolution and send it?
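For reference, a small sketch of one way to do that check programmatically; it assumes the DSM is out_dir/dsm.tif with NaN (or src.nodata) marking missing pixels, and the window offsets are arbitrary placeholders:

import numpy as np
import rasterio
from rasterio.windows import Window

# Crop a small window of the DSM at full resolution and measure the missing fraction.
with rasterio.open("output_dir/dsm.tif") as src:
    crop = src.read(1, window=Window(col_off=5000, row_off=5000, width=512, height=512))
    nodata = src.nodata

missing = np.isnan(crop) if nodata is None or np.isnan(nodata) else (crop == nodata)
print(f"missing fraction in the crop: {missing.mean():.2%}")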
Hey @gfacciol
Adding two pairs of clips: the generated DSM and the corresponding Google tile (clipped from QGIS).
Hey @gfacciol
Update: the fix you suggested reduces the memory, but on even bigger rasters (~30000x30000) it still crashes due to running out of memory. Memory still shoots up to 32 GB.
Update 2: tried with the following settings:
cfg['use_srtm'] = True
cfg['max_processes_stereo_matching'] = 1
cfg['omp_num_threads'] = 8
cfg['disp_min'] = -100
cfg['disp_max'] = 400
cfg['disp_range_method'] = 'fixed_pixel_range'
Setting the 'max_disp_range' field gives the following error if the disparity range is not less than the value provided via the config.
Still running into the OOM error.
Hi,
After recent fixes, I am trying to run the s2p module on 1-band GeoTIFFs (~50 cm resolution). The rasters are approximately 20000 x 20000 pixels. The process fails at the stereo matching step with the following errors:
This comes from the system going out of memory.
System details:
Distributor ID: Ubuntu
Description: Ubuntu 20.04.2 LTS
Release: 20.04
Codename: focal
Linux ip-172-31-3-184 5.4.0-1038-aws #40-Ubuntu SMP Fri Feb 5 23:50:40 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
I have tried changing timeouts, tile sizes, the available matching algorithms, etc.; all runs hit the same issue.
Do I need to add more memory, or is there any other solution I should try?