Closed by JariPekko 1 year ago
Hi Jari,
the error occurs when trying to add the scene that was just downloaded to the QUEUE file, so you probably don't need to worry about the download itself. It's not easy to say what the issue is with the information at hand. Does this only happen once? Can you specify which scene it was?
Thanks, Stefan
Hi Stefan,
thanks for the quick reply. So the download happens correctly but the QUEUE file is not updated correctly? I just counted the files in the download dir (6129) and the lines of the QUEUE file (5567) in case this is helpful.
I think so far the error happened only once each time I started the process. I don't know which scene caused it, but I can give you all the scenes I'm trying to download.
Sensor(s): TM, ETM, OLI
Tile(s): 171074,172074,172075,173074,173075,174074,174075,175074,175075,176073,176074,176075,177070,177071,177072,177073,177074,177075,178070,178071,178072,178073,178074,178075,179070,179072,179073,179074,179075,180072,180073
Date range: 1970-01-01 to 2023-01-04
Included months: 1,2,3,4,5,6,7,8,9,10,11,12
Cloud cover: 0% to 70%
20793 Landsat Level 1 scenes matching criteria found
10.97 TB data volume found
5850 product bundles found in output directory, 14943 not downloaded yet.
Remaining download size: 9.78 TB
Downloading: 1%|=> | 102/14909 [18:55<42:35:28, 10.36s/product bundle]
Downloading: 1%|== | 152/14909 [27:11<30:28:13, 7.43s/product bundle]
Could this be a file conflict caused by downloading images in parallel?
@JariPekko Thanks, it looks like there is definitely an issue with writing the file queue. Please make sure to create the queue for processing yourself before starting the Level 2 processing.
@davidfrantz There is potential for this to happen in the current version. However, according to the traceback the issue here is that the callback function (called after downloading a scene) isn't getting the url passed on properly.
I have run several tests and was unfortunately not able to reproduce the issue.
However, the way that the force queue file is created has been reworked to make sure that there aren't conflicts due to parallel access of processes on the same file. Instead of using a callback, we now use multiprocessing.Queue and a dedicated process that listens for results of the other processes and writes the queue file.
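The single-writer pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual FORCE implementation: the function names, the `download_scene` stand-in, the `None` sentinel, and the `<path> QUEUED` line format are all assumptions made for the example.

```python
import multiprocessing as mp
from pathlib import Path

SENTINEL = None  # assumed marker telling the writer that all workers are done


def download_scene(url, out_dir):
    """Hypothetical stand-in for the real download; returns the local scene path."""
    return str(Path(out_dir) / url.rstrip("/").split("/")[-1])


def worker(urls, out_dir, results):
    """Download each scene and report its path through the shared queue."""
    for url in urls:
        results.put(download_scene(url, out_dir))


def queue_writer(results, queue_file):
    """The only process that touches the queue file, so writes cannot collide."""
    with open(queue_file, "a") as fh:
        while True:
            scene_path = results.get()  # blocks until a worker reports a result
            if scene_path is SENTINEL:
                break
            fh.write(f"{scene_path} QUEUED\n")  # line format is an assumption
            fh.flush()


def run(urls, out_dir, queue_file, n_workers=4):
    """Start one writer and several download workers, then shut down cleanly."""
    results = mp.Queue()
    writer = mp.Process(target=queue_writer, args=(results, queue_file))
    writer.start()
    chunks = [urls[i::n_workers] for i in range(n_workers)]
    workers = [mp.Process(target=worker, args=(c, out_dir, results)) for c in chunks]
    for proc in workers:
        proc.start()
    for proc in workers:
        proc.join()
    results.put(SENTINEL)  # sent only after every worker has finished
    writer.join()
```

Because only `queue_writer` opens the file, the download processes never compete for it; they only hand results to the queue. On platforms that start processes by spawning, `run(...)` would need to be called from under an `if __name__ == "__main__":` guard.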
@JariPekko maybe you can try to pull the latest davidfrantz/force:latest image and let us know if that solves your issue? Thanks!
I pulled the latest image and used it without changing anything else, and it seems to be working as intended now. The download has been running for 1 h now with no error.
A small update on how the process behaved before I pulled the latest davidfrantz/force:latest image:
The downloads did stop completely at some point. The process was still running, but nothing was downloaded for several hours. After aborting manually and starting again, it was always the same pattern:
Download works as intended -> after a short while the error message from above appears but the download is still ongoing -> some time later the download stops
Thanks, and I'll post an update about how it went.
I'm happy to report that the download went flawlessly and rather quickly. In one day the ~11TB were downloaded.
However, the QUEUE file didn't seem to be updated at all. I downloaded 20792 scenes (one fewer than requested), but the QUEUE file had only 5771 lines, the same number it had before I switched to the new davidfrantz/force:latest image. I had to write the QUEUE file manually afterwards.
Thanks a lot for your quick help!!
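For anyone in the same situation, rebuilding the missing entries by hand could look roughly like the sketch below. It is not a FORCE tool. It assumes that each downloaded scene is a `.tar` file somewhere under the download directory, that each queue line is a path followed by a status word such as `QUEUED`, and that the existing queue file ends with a newline. Check all of these against your own files first.

```python
from pathlib import Path


def read_queued_paths(queue_file):
    """Return the scene paths already listed (everything before the last column)."""
    path = Path(queue_file)
    if not path.exists():
        return set()
    lines = path.read_text().splitlines()
    return {line.rsplit(maxsplit=1)[0] for line in lines if line.strip()}


def append_missing_scenes(download_dir, queue_file, pattern="*.tar"):
    """Append scenes found on disk but absent from the queue file.

    Returns the number of lines added. The default pattern and the
    'QUEUED' status word are assumptions, not confirmed FORCE behaviour.
    """
    known = read_queued_paths(queue_file)
    found = sorted(str(p.resolve()) for p in Path(download_dir).rglob(pattern))
    missing = [p for p in found if p not in known]
    with open(queue_file, "a") as fh:
        for scene in missing:
            fh.write(f"{scene} QUEUED\n")
    return len(missing)
```

Paths are compared as strings, so entries only match when the queue file uses the same absolute paths as seen from where the script runs. Inside Docker the mounted paths may differ from those on the host.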
Thanks for the feedback Jari!
I also noticed that the download speed has improved by orders of magnitude. I hope there have been changes to the infrastructure and it will stay like this now.
Glad to hear the issue is solved! To be honest I'm a bit puzzled that the queue file wasn't updated in your case. This was tested successfully here and I also had someone contact me in private with the same issue who had no issues writing the queue file after the update. Was the file maybe locked by another process by any chance?
I'm new to Linux and may be overlooking something, but I can't think of another process that would have locked the QUEUE file. I stopped the download process (force-level1-landsat search --download) that was running from before the update. The only other command I ran involving the QUEUE file was wc -l queue.txt, to count its lines from time to time.
For testing purposes I just downloaded another scene with a new QUEUE file and new directories. This time the QUEUE file was updated correctly.
Hi! I get the following error message when downloading Landsat images from USGS with force-level1-landsat search and I don't know whether it's a problem.
Error message
I used the following command:
dforce force-level1-landsat search Landsat_tiles.txt images/ --cloudcover 0,70 --queue-file queue.txt --secret usgs_m2m_access.txt --download
Behaviour
The download starts as expected and images are downloaded. After a few minutes the error message from above appears, but the process is not aborted and the download continues.
Setup
FORCE version 3.7.10 using Docker
Ubuntu 20.04.5 LTS
Linux server, 500G RAM, 80 CPUs
Question
Do I have to worry? Is it just a warning that a URL didn't work?