Closed mhotalebi closed 5 months ago
In case of network issues, you can try downloading the orbits sequentially instead of using multiple parallel downloads (8 by default):
S1.download_orbits(..., n_jobs=1)
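For example, reusing the variable names from the retry snippet at the end of this thread (and assuming the usual from pygmtsar import S1 import), the sequential call would look roughly like this:
from pygmtsar import S1
# scan the data directory for SLC scenes and download the missing orbits
# one at a time instead of 8 in parallel
S1.download_orbits(DATADIR, S1.scan_slc(DATADIR2), n_jobs=1)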
Thanks for your response.
I also have another problem. My model runs perfectly until it reaches sbas.sync_cube(stl_sbas, 'stl_sbas'). I have waited for more than a day, but nothing happens. I also removed this line and ran the rest of the code, but then it gets stuck on the following line instead and never finishes:
zmin, zmax = np.nanquantile(velocity_sbas, [0.01, 0.99])
Sometimes I see memory warnings indicating that my memory limit is 4 GB, and I'm almost reaching that limit.
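As a side note, if velocity_sbas is a large (possibly dask-backed) grid, np.nanquantile pulls the whole array into memory at once. A minimal sketch of a lower-memory sanity check for that step, assuming velocity_sbas is a 2D velocity map and using an arbitrary decimation factor, would be:
import numpy as np
# estimate plotting limits from every 4th pixel instead of the full grid
velocity_small = np.asarray(velocity_sbas[::4, ::4])
zmin, zmax = np.nanquantile(velocity_small, [0.01, 0.99])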
I actually changed the Dask cluster configuration in my cloud to the following, but it didn't solve the problem:
import dask, dask.distributed
dask.config.set({'distributed.comm.timeouts.tcp': '60s'})
dask.config.set({'distributed.comm.timeouts.connect': '60s'})
dask.config.set({'distributed.worker.memory.target': 0.75})
dask.config.set({'distributed.worker.memory.spill': 0.85})
dask.config.set({'distributed.worker.memory.pause': 0.90})
dask.config.set({'distributed.worker.memory.terminate': 0.98})

from dask.distributed import Client, LocalCluster
if 'client' in globals():
    client.close()
if 'cluster' in globals():
    cluster.close()
cluster = LocalCluster(n_workers=3, threads_per_worker=2)
client = Client(cluster)
client
I finally changed the cluster to this configuration and am running the model now; I hope it works.
Usually, we try to have 4+ GB of RAM for every worker; my example configuration uses 3 workers on 16 GB of RAM, providing about 5.5 GB per worker. You can allocate less RAM per worker, but sometimes it may not be sufficient.
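A minimal sketch of that kind of setup (the explicit per-worker memory_limit is my addition; by default LocalCluster simply splits the machine's RAM among the workers):
from dask.distributed import Client, LocalCluster
# 3 workers x 2 threads, about 5.5 GB per worker on a 16 GB machine
cluster = LocalCluster(n_workers=3, threads_per_worker=2, memory_limit='5.5GB')
client = Client(cluster)
client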
I'm experiencing an issue with the S1.download_orbits function in my project. When I attempt to download orbits for more than 30 scenes, I receive a connection error. This issue occurs regardless of the network I use, and I have tried multiple different networks to rule out connectivity issues on my end. It seems like the connection is being reset by the peer when handling larger batches, which could be a timeout or resource-limit issue on the server side. I also used the following code, but it didn't work either:
# scan the data directory for SLC scenes and download missed orbits
import time

max_attempts = 5  # Maximum number of attempts
attempt = 0       # Current attempt
success = False   # Flag to indicate success

while attempt < max_attempts and not success:
    try:
        attempt += 1
        print(f"Attempt {attempt} of {max_attempts}")
        S1.download_orbits(DATADIR, S1.scan_slc(DATADIR2))
        success = True  # If download succeeds, set success to True
        print("Download successful.")
    except Exception as e:  # Catch any exception (consider specifying exact exceptions)
        print(f"An error occurred: {e}")
        time.sleep(5)  # Wait for 5 seconds before retrying (to avoid hammering the server)

if not success:
    print("Failed to download after maximum attempts.")