Closed mcgarth closed 4 years ago
When running pyrate process in serial, APS correction runs without issue:
(pyrate) [mcg547@r3764 PyRate_Medium_test]$ pyrate process -f input_parameters.conf -r 1 -c 1
However, when using the OpenMPI functionality there is a problem:
(pyrate) [mcg547@r3764 PyRate_Medium_test]$ mpirun -n 28 pyrate process -f input_parameters.conf -r 7 -c 4
+2s pyrate.main:INFO Verbosity set to INFO.
+2s pyrate.core.algorithm:INFO Found 9 unique epochs in the 15 interferogram network
+2s pyrate.process:INFO Finished converting phase_data to numpy in process 0
+2s pyrate.process:INFO Searching for best reference pixel location
+2s pyrate.core.refpixel:INFO Setting up ref pixel computation
+2s pyrate.core.refpixel:INFO Chipsize validation successful
+2s pyrate.core.refpixel:INFO Ref pixel setup finished
+2s pyrate.core.refpixel:INFO Saving ref pixel blocks
+4s pyrate.core.refpixel:INFO Saved ref pixel blocks
+4s pyrate.core.refpixel:INFO Ref pixel calculation started
+4s pyrate.core.refpixel:INFO Filtering means during reference pixel computation
+4s pyrate.process:INFO Selected reference pixel coordinate: (562, 424)
+4s pyrate.process:INFO Calculating orbfit correction
+4s pyrate.process:INFO Checking Orbital error correction status
+4s pyrate.core.shared:INFO Calculating corrections
+7s pyrate.core.orbital:INFO Removing orbital error using NETWORK correction method and degree=QUADRATIC
+10s pyrate.process:INFO Finished Orbital error correction
+10s pyrate.process:INFO Checking reference phase estimation status
+10s pyrate.core.shared:INFO Calculating corrections
+10s pyrate.process:INFO Computing reference phase via method 2
+11s pyrate.process:INFO Ref phase computed in process 0
+12s pyrate.process:INFO Finished reference phase estimation
+12s pyrate.process:INFO Calculating minimum spanning tree matrix using NetworkX method
+12s pyrate.process:INFO finished mst calculation for process 0
+14s pyrate.core.aps:INFO Checking APS correction status
+14s pyrate.core.shared:INFO Calculating corrections
+14s pyrate.core.aps:INFO Calculating time series via SVD method for spatio-temporal filter
+14s pyrate.core.aps:INFO Calculating time series for tile 0 during aps correction
+14s pyrate.core.algorithm:INFO Found 9 unique epochs in the 15 interferogram network
Traceback (most recent call last):
  File "/home/547/mcg547/.virtualenvs/pyrate/bin/pyrate", line 11, in <module>
    load_entry_point('Py-Rate==0.3.0.post3', 'console_scripts', 'pyrate')()
  File "/home/547/mcg547/repo/PyRate/pyrate/main.py", line 183, in main
    process_handler(args.config_file, args.rows, args.cols)
  File "/home/547/mcg547/repo/PyRate/pyrate/main.py", line 55, in process_handler
    process.process_ifgs(sorted(dest_paths), params, rows, cols)
  File "/home/547/mcg547/repo/PyRate/pyrate/process.py", line 373, in process_ifgs
    _wrap_spatio_temporal_filter(ifg_paths, params, tiles, preread_ifgs)
  File "/home/547/mcg547/repo/PyRate/pyrate/core/aps.py", line 56, in _wrap_spatio_temporal_filter
    tsincr = _calc_svd_time_series(ifg_paths, params, preread_ifgs, tiles)
  File "/home/547/mcg547/repo/PyRate/pyrate/core/aps.py", line 116, in _calc_svd_time_series
    tsincr_g = mpiops.run_once(_assemble_tsincr, ifg_paths, params, preread_ifgs, tiles, nvels)
  File "/home/547/mcg547/repo/PyRate/pyrate/core/mpiops.py", line 54, in run_once
    f_result = f(*args, **kwargs)
  File "/home/547/mcg547/repo/PyRate/pyrate/core/aps.py", line 129, in _assemble_tsincr
    _assemble_tiles(i, n, t, tsincr_g[:, :, i], params[cf.TMPDIR], 'tsincr_aps')
  File "/home/547/mcg547/repo/PyRate/pyrate/postprocess.py", line 177, in _assemble_tiles
    tsincr = np.load(file=tsincr_file)
  File "/home/547/mcg547/repo/PyRate/.eggs/numpy-1.16.4-py3.7-linux-x86_64.egg/numpy/lib/npyio.py", line 422, in load
    fid = open(os_fspath(file), "rb")
FileNotFoundError: [Errno 2] No such file or directory: '/short/dg9/insar/PyRate_Medium_test/out/tmpdir/tsincr_aps_1.npy'
Perhaps the file tsincr_aps_1.npy had not been fully written when process 0 tried to load it? Do we need a 'wait' before the tiles are assembled?
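To illustrate the kind of 'wait' being asked about, here is a minimal sketch of the suspected race and a barrier that would avoid it. It is not PyRate code: it uses threads from the standard library as stand-ins for MPI ranks, and the names run_worker, assemble_with_barrier and the tile_<rank>.json files are hypothetical. With mpi4py, the corresponding synchronisation call would be MPI.COMM_WORLD.Barrier(); whether that is the right fix here is an assumption, not something the traceback confirms.

```python
import json
import os
import tempfile
import threading

N_WORKERS = 4  # stand-in for the number of MPI ranks (assumption)


def run_worker(rank, tmpdir, barrier, results):
    # Each worker writes its own tile file, playing the role of
    # the per-tile tsincr_aps_<i>.npy files seen in the traceback.
    tile_path = os.path.join(tmpdir, f"tile_{rank}.json")
    with open(tile_path, "w") as fh:
        json.dump([rank, rank], fh)

    # The 'wait': no worker proceeds until every worker has written its file.
    barrier.wait()

    # Only worker 0 assembles, similar in spirit to a run-once step.
    if rank == 0:
        assembled = []
        for i in range(N_WORKERS):
            with open(os.path.join(tmpdir, f"tile_{i}.json")) as fh:
                assembled.extend(json.load(fh))
        results["assembled"] = assembled


def assemble_with_barrier():
    barrier = threading.Barrier(N_WORKERS)
    results = {}
    with tempfile.TemporaryDirectory() as tmpdir:
        threads = [
            threading.Thread(target=run_worker, args=(rank, tmpdir, barrier, results))
            for rank in range(N_WORKERS)
        ]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
    return results["assembled"]


if __name__ == "__main__":
    print(assemble_with_barrier())
```

Without the barrier.wait() call, worker 0 could reach the assembly loop before another worker has written its tile, which would fail with the same kind of FileNotFoundError as above.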
There were two causes for this error: