PolarGeospatialCenter / imagery_utils

pgc_ortho.py new CSV argument list source type #34

Closed ehusby closed 3 years ago

ehusby commented 3 years ago

To aid with complex processing tasks, such as those with many source images where groups of images require different output projections, you can provide a CSV file (*.csv) as the src argument to pgc_ortho.py.

The CSV file must have a header row naming the script arguments whose values are supplied in the rows below, with one row per task (each image to be processed). For pgc_ortho.py, the first column must be the absolute path to the source image, and its column header must be "src" (with or without quotes -- single and double quotes are stripped from the ends of all CSV elements).
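As a rough illustration of the layout described above, here is a minimal sketch of reading such a file. It is not the code in this PR: the helper name read_task_rows, the sample paths, and the choice to strip whitespace before quotes are all assumptions made for the example.

```python
import csv
import io


def read_task_rows(csv_text):
    """Parse CSV text into (header, rows), stripping quotes from each element.

    Hypothetical helper for illustration only; not part of imagery_utils.
    """
    def clean(value):
        # Strip surrounding whitespace, then single and double quotes from both ends.
        return value.strip().strip("'\"")

    reader = csv.reader(io.StringIO(csv_text))
    header = [clean(h) for h in next(reader)]
    if header[0] != "src":
        raise ValueError("first CSV column must be 'src', got {!r}".format(header[0]))
    # One dict per task, keyed by the argument names from the header row.
    rows = [dict(zip(header, (clean(v) for v in row))) for row in reader if row]
    return header, rows


# Made-up example content: two images that need different output projections.
example = (
    "'src',epsg,stretch\n"
    "/data/imagery/image_a.ntf,32718,rf\n"
    "/data/imagery/image_b.ntf,3031,rf\n"
)
header, rows = read_task_rows(example)
```

All values come back as strings, so any type conversion (for example, epsg to int) would still have to happen afterwards.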

Not all script arguments need to be listed in the CSV file. Script arguments provided on the command line that aren't overridden by the CSV file are passed as usual to each image processing task.
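The override behavior can be pictured as layering each CSV row on top of the parsed command-line arguments. The following is only a sketch under that assumption; merge_task_args and the sample values are hypothetical and do not reproduce yield_src_tasks.

```python
import argparse


def merge_task_args(cli_args, row):
    """Return a copy of the command-line namespace with CSV row values applied on top.

    Hypothetical helper for illustration only.
    """
    merged = argparse.Namespace(**vars(cli_args))
    for name, value in row.items():
        # argparse stores "--pyramid-type" as "pyramid_type", so normalize the key.
        setattr(merged, name.replace("-", "_"), value)
    return merged


# Values given on the command line apply to every task unless a row overrides them.
cli_args = argparse.Namespace(src="tasks.csv", epsg=0, stretch="rf", resample="near")
row = {"src": "/data/imagery/image_a.ntf", "epsg": "32718"}
task_args = merge_task_args(cli_args, row)
```

Copying the namespace per task keeps one row's overrides from leaking into the next task.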

One issue is that the parent argument parser in ortho_functions.py makes the --epsg argument required. As a workaround, if src is a CSV file you can provide --epsg 0, and as long as "epsg" is a field in the CSV file, the script will accept it. All unique EPSG codes in the CSV file are validated upfront using the utils.SpatialRef class.
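The idea of validating each distinct EPSG code once, before any task starts, might look roughly like this. The sketch does not call utils.SpatialRef, whose interface is not shown in this thread; the is_valid callback and looks_like_epsg are hypothetical stand-ins.

```python
def validate_epsg_codes(rows, is_valid):
    """Check each distinct EPSG code once; raise if any is rejected.

    `is_valid` is a hypothetical callback standing in for real validation.
    """
    unique_codes = sorted({int(row["epsg"]) for row in rows if row.get("epsg")})
    bad = [code for code in unique_codes if not is_valid(code)]
    if bad:
        raise ValueError("invalid EPSG codes in CSV: {}".format(bad))
    return unique_codes


def looks_like_epsg(code):
    # Placeholder check only: rejects the 0 used as a command-line workaround.
    return code > 0


rows = [
    {"src": "/data/imagery/image_a.ntf", "epsg": "32718"},
    {"src": "/data/imagery/image_b.ntf", "epsg": "3031"},
    {"src": "/data/imagery/image_c.ntf", "epsg": "32718"},
]
codes = validate_epsg_codes(rows, looks_like_epsg)
```

Checking the set of codes rather than every row means a bad code is reported before any image is processed, and repeated codes cost nothing extra.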

ehusby commented 3 years ago

I've added all three of you as reviewers so whoever has time can test the change (or test normal usage of pgc_ortho.py to confirm it didn't break anything). I've tested normal usage myself and didn't come across any breakage.

Look over the changes as you see fit. I heavily documented the new, slightly-wacky yield_src_tasks function that powers this new per-task argument handling paradigm. Hopefully it makes sense, but would be good to get your feedback.

bagl0025 commented 3 years ago

I created a CSV file with src, stretch, and epsg columns. I tested it in PyCharm and it ran with no errors. I like the CSV option!

bakkerbakker commented 3 years ago

This is a nice addition, but I am hitting an error when testing it on Nunatak, which could be an environment issue on my end. When running from the pgc conda env on my local machine via the terminal in PyCharm it executes as expected. When I try to run it on Nunatak with the --pbs flag I get the following error:

python "/home/bakke557/scratch/repos/imagery_utils/pgc_ortho.py" --format "GTiff" --gtiff-compression "lzw" --epsg 32718 --outtype "Byte" --stretch "rf" --resample "near" --pyramid-type "near" --threads 1 --scratch "/home/bakke557/scratch/task_bundles" --c "rf" "/mnt/pgc/data/scratch/jesse/projects/_testing/imagery/WV01_20161029181645_10200100561FCC00_16OCT29181645-P1BS-504501710050_01_P001.ntf" "/mnt/pgc/data/scratch/jesse/projects/_testing/ortho"
Traceback (most recent call last):
  File "/home/bakke557/scratch/repos/imagery_utils/pgc_ortho.py", line 14, in <module>
    from lib import ortho_functions, taskhandler, utils
  File "/mnt/pgc/data/scratch/jesse/repos/imagery_utils/lib/ortho_functions.py", line 17, in <module>
    from lib import taskhandler, utils
  File "/mnt/pgc/data/scratch/jesse/repos/imagery_utils/lib/utils.py", line 12, in <module>
    from collections.abc import Collection
ImportError: No module named abc

When I run that same import line in python from within the same pgc conda env I don't hit the error:

(pgc) [bakke557@nunatak:~/scratch/projects/_testing]$ python
Python 3.7.10 | packaged by conda-forge | (default, Feb 19 2021, 16:07:37) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from collections.abc import Collection
>>> 

Without the --pbs flag on nunatak it seems to execute without error.

bagl0025 commented 3 years ago

Do you have a bunch of red errors complaining about undefined modules? The ones in that error?

ehusby commented 3 years ago

Ah, I know what the issue is @bakkerbakker. You'll need to swap out the module load gdal/2.1.3 line in qsub_ortho.sh for source ~/.bashrc; conda activate pgc. The collections.abc module only exists in Python 3, so the import fails when the jobs run on the cluster under the Python 2 gdal/2.1.3 environment.
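For anyone hitting the same ImportError, one way to make this failure more obvious is a fail-fast interpreter check placed before the Python 3-only imports. This is a sketch of that idea, not something in this PR; require_python3 is a hypothetical name.

```python
import sys


def require_python3(version_info=None):
    """Raise a clear error when running under Python 2.

    Hypothetical guard for illustration; accepts any (major, minor, ...) tuple.
    """
    if version_info is None:
        version_info = sys.version_info
    if version_info[0] < 3:
        raise RuntimeError(
            "Python 3 is required (found {}.{}); activate the pgc conda "
            "environment before running this script".format(
                version_info[0], version_info[1]
            )
        )


require_python3()
```

The guard itself avoids Python 3-only syntax so that it can still run, and report the problem, under Python 2.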

I've only really tested pgc_ortho.py with the pgc conda environment. Could you test that change for pgc_mosaic.py in your next runs of that on Nunatak, Jesse?

ehusby commented 3 years ago

I just changed the environment load command in qsub_ortho.sh to use the python3 pgc conda environment. If you would re-pull this branch and try again @bakkerbakker it should work now! Thanks for testing

bakkerbakker commented 3 years ago

@ehusby I have a test job submitted to the cluster with the qsub_ortho.sh script and will let you know once it gets through the queue if I run into any other issues. Thanks for implementing the qsub fix.

bakkerbakker commented 3 years ago

The tests on the cluster worked after that qsub fix. Looks good.