Closed. bountrisv closed this issue 2 years ago.
Hi bountrisv,
how many pods are you running simultaneously? Given the specs of each pod, I assume you want to process 1 image per pod? I am not exactly sure what your setup is (and I have little to no knowledge of Kubernetes), but I think in this case you should rather use force-l2ps directly: https://force-eo.readthedocs.io/en/latest/components/lower-level/level2/l2ps.html#level2-wrapper
As far as I remember, force-level2 generates a TXT file (named cpu- + timestamp) that allows you to change the number of processes while the task is running. In your case, if you start a lot of pods at the same time, it might actually generate two or more files with the same name and hence cause some kind of conflict (assuming the pods are using a common file system). But that's just a guess ...
Yep, pretty sure that is happening. force-level2 deletes that file when it is finished.
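The suspected conflict can be reproduced in miniature. This is an illustration only, not force-level2 code: the `cpu-<timestamp>` pattern is taken from the description above, while the exact file name, the one-second timestamp format, and the directory standing in for the shared file system are assumptions.

```sh
# Illustration only. The exact name and timestamp format used by
# force-level2 are assumptions; only "cpu- + timestamp" is from the thread.

shared_dir=$(mktemp -d)          # stands in for the shared file system
stamp=$(date +%Y%m%d%H%M%S)      # one-second resolution (assumed)
cpu_file="$shared_dir/cpu-$stamp"

# Pods A and B start within the same second and write the same file.
echo 1 > "$cpu_file"             # pod A
echo 1 > "$cpu_file"             # pod B silently reuses pod A's file

# Pod A finishes first and removes the file.
rm "$cpu_file"

# Pod B finishes later; the file is already gone, so its delete fails.
if rm "$cpu_file" 2>/dev/null; then
    pod_b_cleanup=ok
else
    pod_b_cleanup=failed
fi
echo "pod B cleanup: $pod_b_cleanup"

rmdir "$shared_dir"
```

The second `rm` fails because the first one already removed the file, which would match an error that appears only after the image itself has been processed correctly.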
This program was never designed to be put in another parallel section.
You can call force-l2ps_ or force-l2ps directly, which will get rid of one level of parallelism. I believe this would be the best solution overall, especially as each job only contains one image.
Cheers, David
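A minimal sketch of what calling the core program directly could look like inside one pod. Several things here are assumptions rather than confirmed FORCE behavior: the queue file format (`<image path> <status>` per line, with status `QUEUED` for pending images), the `<image> <parameter file>` argument order, and the helper names `process_queue` and `L2PS`, which are hypothetical. Check the usage message of your FORCE version before relying on it.

```sh
# Process every image marked QUEUED in a queue file, one at a time.
# $1 = queue file, $2 = parameter file.
# L2PS selects the program to call and defaults to force-l2ps
# (argument order "<image> <parameter file>" is an assumption).
process_queue() {
    pq_queue=$1
    pq_param=$2
    while read -r pq_image pq_status; do
        # Skip images that are already done or otherwise not pending.
        [ "$pq_status" = "QUEUED" ] || continue
        # </dev/null keeps the called program from consuming the queue file.
        "${L2PS:-force-l2ps}" "$pq_image" "$pq_param" < /dev/null || return 1
    done < "$pq_queue"
}

# Example call (commented out so the sketch has no side effects):
# process_queue /data/outputs/queue_files/queue_0313.txt "$PARAM_FILE"
```

Because the loop is sequential, the only parallelism left is the one the pods themselves provide, so no shared control file is involved.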
Hey all, thanks for your responses!
I run from 10 to ~65 pods. If they can create the same filename, that very well sounds like the problem behind it. Thanks for the proposed solutions.
Before I close the issue, one question regarding the last proposed solution:
Does force-l2ps take care of merging images as well, or would that have to be done manually?
Merging will be done, too!
I believe the only thing to really consider is whether to use l2ps_ or l2ps. Use l2ps if your input images are unpacked. Use l2ps_ if input images are still packed (zip/tar/tar.gz).
Another small difference that you should be aware of is the handling of the logfiles. force-l2ps simply writes to stdout, while force-level2 redirects this stream to a logfile (DIR_LOG).
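If you still want per-image logfiles when calling force-l2ps yourself, the redirection can be done in the calling script. The sketch below is one way to do it under assumptions: the helper names `param_value` and `run_logged` are hypothetical, the `<image name>.log` naming is made up for illustration (the naming force-level2 uses is not stated here), and the parser only handles single-token values on `KEY = VALUE` lines.

```sh
# Print the value of a "KEY = VALUE" entry from a parameter file.
# $1 = key, $2 = parameter file. Only the first token of the value is
# returned, so this is not suitable for entries such as PROJECTION.
param_value() {
    awk -v key="$1" '$1 == key && $2 == "=" { print $3; exit }' "$2"
}

# Run a command and store its stdout and stderr in
# <log dir>/<image base name>.log (hypothetical naming scheme).
# $1 = log directory, $2 = image path, remaining arguments = command.
run_logged() {
    rl_log_dir=$1
    rl_image=$2
    shift 2
    rl_log_file="$rl_log_dir/$(basename "$rl_image").log"
    "$@" > "$rl_log_file" 2>&1
}

# Example call (commented out; argument order of force-l2ps is an assumption):
# log_dir=$(param_value DIR_LOG "$PARAM_FILE")
# run_logged "$log_dir" "$image" force-l2ps "$image" "$PARAM_FILE"
```

Since each pod writes a logfile named after its own image, two pods only collide if they process the same image.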
Cheers, David
Thank you for your fast and helpful responses. The issue is caused by the way I am executing FORCE, and replacing force-level2 with force-l2ps works well in my case, so closing.
Note: I am using an older version of FORCE (and might have to stay on it to keep the runtime experiment comparisons fair).
Over multiple invocations of force-level2, seemingly at random, the script fails after correctly processing the image because it is unable to delete a log file.
Is there a way to catch or avoid the error? Is it because I am on an older version of FORCE?
Setup
force-level2 $PARAM_FILE
with 2 cpus, 4.5Gb of RAM

```
INPUT/OUTPUT DIRECTORIES
------------------------------------------------------------------------
FILE_QUEUE = /data/outputs/queue_files/queue_0313.txt
DIR_LEVEL2 = /data/outputs/level2_ard/
DIR_LOG = /data/outputs/level2_log/
DIR_TEMP = /data/outputs/level2_tmp/

DIGITAL ELEVATION MODEL
------------------------------------------------------------------------
FILE_DEM = /data/outputs/auxillary_data/dem/crete_srtm-aster.vrt
DEM_NODATA = -32767

DATA CUBES
------------------------------------------------------------------------
DO_REPROJ = TRUE
DO_TILE = TRUE
FILE_TILE = /data/outputs/allowed_tiles.txt
TILE_SIZE = 30000.000000
BLOCK_SIZE = 3000.000000
RESOLUTION_LANDSAT = 30
RESOLUTION_SENTINEL2 = 10
ORIGIN_LON = -25.000000
ORIGIN_LAT = 60.000000
PROJECTION = PROJCS["ETRS89 / LAEA Europe",GEOGCS["ETRS89",DATUM["European_Terrestrial_Reference_System_1989",SPHEROID["GRS 1980",6378137,298.257222101,AUTHORITY["EPSG","7019"]],TOWGS84[0,0,0,0,0,0,0],AUTHORITY["EPSG","6258"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4258"]],PROJECTION["Lambert_Azimuthal_Equal_Area"],PARAMETER["latitude_of_center",52],PARAMETER["longitude_of_center",10],PARAMETER["false_easting",4321000],PARAMETER["false_northing",3210000],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AUTHORITY["EPSG","3035"]]
RESAMPLING = CC

RADIOMETRIC CORRECTION OPTIONS
------------------------------------------------------------------------
DO_ATMO = TRUE
DO_TOPO = TRUE
DO_BRDF = TRUE
ADJACENCY_EFFECT = TRUE
MULTI_SCATTERING = TRUE

WATER VAPOR CORRECTION OPTIONS
------------------------------------------------------------------------
DIR_WVPLUT = /data/outputs/auxillary_data/wvdb
WATER_VAPOR = NULL

AEROSOL OPTICAL DEPTH OPTIONS
------------------------------------------------------------------------
DO_AOD = TRUE
DIR_AOD = NULL

CLOUD DETECTION OPTIONS
------------------------------------------------------------------------
ERASE_CLOUDS = FALSE
MAX_CLOUD_COVER_FRAME = 75
MAX_CLOUD_COVER_TILE = 75
CLOUD_BUFFER = 300
SHADOW_BUFFER = 90
SNOW_BUFFER = 30
CLOUD_THRESHOLD = 0.225
SHADOW_THRESHOLD = 0.02

RESOLUTION MERGING
------------------------------------------------------------------------
RES_MERGE = IMPROPHE

CO-REGISTRATION OPTIONS
------------------------------------------------------------------------
DIR_COREG_BASE = NULL
COREG_BASE_NODATA = -9999

MISCELLANEOUS OPTIONS
------------------------------------------------------------------------
IMPULSE_NOISE = TRUE
BUFFER_NODATA = FALSE

TIER LEVEL
------------------------------------------------------------------------
TIER = 1

PARALLEL PROCESSING
------------------------------------------------------------------------
Multiprocessing options (NPROC, DELAY) only apply when using the batch
utility force-level2. They are not used by the core function force-l2ps.
------------------------------------------------------------------------
NPROC = 1
NTHREAD = 4
PARALLEL_READS = FALSE
DELAY = 3
TIMEOUT_ZIP = 30

OUTPUT OPTIONS
------------------------------------------------------------------------
OUTPUT_FORMAT = GTiff
OUTPUT_DST = FALSE
OUTPUT_AOD = FALSE
OUTPUT_WVP = FALSE
OUTPUT_VZN = FALSE
OUTPUT_HOT = FALSE
OUTPUT_OVV = TRUE
++PARAM_LEVEL2_END++
```