swisstopo / topo-satromo

Earth observation satellite data for drought monitoring (SATROMO)
BSD 3-Clause "New" or "Revised" License

Not emptying task list and then tries to reprocess file #95

Closed davidoesch closed 2 months ago

davidoesch commented 2 months ago

The publisher process starts after S2-SR and VHI have been exported, and it correctly identifies the date:

Date: 2024-09-15
checking export status
... checking status of asset: ch.swisstopo.swisseo_s2-sr_v100_mosaic_2024-09-15T102559_bands-20m
... checking status of asset: ch.swisstopo.swisseo_vhi_v100_mosaic_2024-09-15T235959_forest-10m
--> 2024-09-15 all assets exported and READY ...

We would now expect it to start with the VHI export, but the next step is:

ch.swisstopo.swisseo_s2-sr_v100_mosaic_2024-09-15T102559_bands-20m starting processing ...
['df', '-h']
CompletedProcess(args=['df', '-h'], returncode=0, stdout='Filesystem Size Used Avail Use% Mounted on\n/dev/root 73G 53G 21G 72% /\ntmpfs 7.9G 172K 7.9G 1% /dev/shm\ntmpfs 3.2G 1.1M 3.2G 1% /run\ntmpfs 5.0M 0 5.0M 0% /run/lock\n/dev/sda15 105M 6.1M 99M 6% /boot/efi\n/dev/sdb1 74G 4.1G 66G 6% /mnt\ntmpfs 1.6G 12K 1.6G 1% /run/user/1001\ngeedrivePROD: 15G 4.7G 11G 32% /home/runner/work/topo-satromo/topo-satromo/localgdrive\n', stderr='')
SUCCESS: merged ch.swisstopo.swisseo_s2-sr_v100_mosaic_2024-09-15T102559_bands-20m.tif
ITEM object 2024-09-15t102559: creating
ASSET object ch.swisstopo.swisseo_s2-sr_v100_mosaic_2024-09-15t102559_bands-20m.tif: does not exist preparing...
TIF asset - Multipart upload
Part 1/6 of File ch.swisstopo.swisseo_s2-sr_v100_mosaic_2024-09-15t102559_bands-20m.tif uploaded after attempt 1

It publishes everything correctly for VHI and S2-SR,

but it then fails at https://github.com/swisstopo/topo-satromo/actions/runs/10933804434/job/30352859784#step:9:372 with:

keeping file:ch.swisstopo.swisseo_vhi_v100_mosaic_2024-09-15T235959_metadata.json
... checking status of asset: ch.swisstopo.swisseo_s2-sr_v100_mosaic_2024-09-15T102559_bands-10m
... checking status of asset: ch.swisstopo.swisseo_s2-sr_v100_mosaic_2024-09-15T102559_registration-10m
... checking status of asset: ch.swisstopo.swisseo_s2-sr_v100_mosaic_2024-09-15T102559_masks-10m
--> 2024-09-15 all assets exported and READY ...
ch.swisstopo.swisseo_s2-sr_v100_mosaic_2024-09-15T102559_bands-20m starting processing ...
with open(os.path.join(
FileNotFoundError: [Errno 2] No such file or directory: 'processing/ch.swisstopo.swisseo_s2-sr_v100_mosaic_2024-09-15T102559_bands-20m_metadata.json'
Error: Process completed with exit code 1.

because it evidently tries to process the S2-SR asset again.

This only happens on PROD.

To Reproduce

Steps to reproduce the behavior:

  1. Run the publisher for a current date
  2. See error https://github.com/swisstopo/topo-satromo/actions/runs/10933804434/job/30352859784

Expected behavior

No error.

Workaround

Since the data is correct, the bug only affects the log.

However, you have to manually delete the files in https://github.com/swisstopo/topo-satromo/tree/main/processing and empty running_tasks.csv.
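The manual cleanup described in this workaround could be scripted roughly as follows. This is only a sketch and not part of the repository: the function name `reset_processing_state` is hypothetical, and it assumes that the task list is a file named `running_tasks.csv` inside the processing folder and that its first line is a header worth keeping. Adjust both assumptions to the actual layout before using it.

```python
from pathlib import Path


def reset_processing_state(processing_dir, task_list_name="running_tasks.csv",
                           keep_header=True):
    """Delete leftover files in processing_dir and empty the task list.

    Hypothetical helper: the file name and the header handling are
    assumptions for illustration, not taken from the repository.
    Returns the names of the deleted files.
    """
    folder = Path(processing_dir)
    task_list = folder / task_list_name
    deleted = []
    for path in sorted(folder.iterdir()):
        # Remove every regular file except the task list itself.
        if path.is_file() and path != task_list:
            path.unlink()
            deleted.append(path.name)
    if task_list.exists():
        lines = task_list.read_text().splitlines(keepends=True)
        # Keep only the first line (assumed header), or nothing at all.
        header = lines[:1] if keep_header else []
        task_list.write_text("".join(header))
    return deleted
```

Calling `reset_processing_state("processing")` from the repository root would then remove the stale files and leave only the (possibly empty) task list behind. Subdirectories are left untouched.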

TO DO

Check WHY this occurs. It probably has to do with grouped_files (https://github.com/swisstopo/topo-satromo/blob/413a447d8a1fc15cd7dc4ad2550ceac2306aaf1e/satromo_publish.py#L778): processing does not start with the first complete group. This has to be implemented somewhere around unique_filename_day (https://github.com/swisstopo/topo-satromo/blob/413a447d8a1fc15cd7dc4ad2550ceac2306aaf1e/satromo_publish.py#L787): this list has to be sorted. In addition or alternatively, at https://github.com/swisstopo/topo-satromo/blob/413a447d8a1fc15cd7dc4ad2550ceac2306aaf1e/satromo_publish.py#L981 the filename also has to be removed from the "group". This might fix it.
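The idea in this TO DO (sort the groups, start with the first complete one, and remove finished filenames from the group) can be illustrated with a small sketch. It is not the code in satromo_publish.py and not necessarily what the fix commits do: the helper names `group_key` and `process_complete_groups` are hypothetical, and the grouping rule assumes asset names of the form `<product>_mosaic_<timestamp>_<suffix>` as seen in the log output.

```python
from collections import defaultdict


def group_key(asset_name):
    """Return the part of an asset name shared by one product and timestamp.

    Assumes names look like '<product>_mosaic_<timestamp>_<suffix>'.
    """
    return asset_name.rsplit("_", 1)[0]


def process_complete_groups(pending, ready, process):
    """Process each fully exported group once, in sorted order.

    pending: asset names still listed as running tasks (hypothetical input)
    ready:   set of asset names whose export has finished
    process: callback invoked once per complete group
    Returns the asset names that remain pending.
    """
    groups = defaultdict(list)
    for name in pending:
        groups[group_key(name)].append(name)

    remaining = []
    for key in sorted(groups):  # deterministic order across runs
        members = sorted(groups[key])
        if all(m in ready for m in members):
            process(key, members)
            # Finished assets are not carried over, so a later pass
            # cannot pick the same group up a second time.
        else:
            remaining.extend(members)
    return remaining
```

Writing the returned list back to the task list after each pass is what would prevent an already published S2-SR group from being processed again once its metadata file has been cleaned up.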

davidoesch commented 2 months ago

fixed on DEV in https://github.com/swisstopo/topo-satromo/commit/2392c86b6b8925a813cfad2fd4b77ea5e28d7241 and on MAIN in https://github.com/swisstopo/topo-satromo/commit/777a1fb192ef764a1cd08ec6199e0501b37d9b71