This is currently not working: the file is always flagged as a duplicate of an existing db entry, so its data_node always stays the same as the first one added to the database.
Currently, the only workaround is to manually delete the files from the database. I will explain how to do it with the following example:
Example
Some files from the query abc123 did not download because the data node 'vesg.ipsl.upmc.fr' is down.
To find out which data node is down, you can use this Python snippet:
from esgpull import Esgpull
from esgpull.models import FileStatus

esg = Esgpull(path="path/to/install")
query = esg.graph.get("abc123")
# Collect the data nodes of every file that is not fully downloaded
data_nodes = set(f.data_node for f in query.files if f.status != FileStatus.Done)
print(data_nodes)
In this example, this would print:
{'vesg.ipsl.upmc.fr'}
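If several data nodes are involved, a per-node count of the missing files may be more telling than a set. Below is a minimal sketch of that idea. It runs without esgpull: `FakeFile` and the local `FileStatus` are hypothetical stand-ins that expose only the `data_node` and `status` attributes used above, and `Queued` is an assumed "not done" value for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class FileStatus(Enum):
    # Stand-in for esgpull.models.FileStatus. Only Done appears in the
    # snippet above; Queued is an assumed "not done" value.
    Done = "done"
    Queued = "queued"


@dataclass
class FakeFile:
    # Hypothetical stand-in with the two attributes the snippet relies on.
    data_node: str
    status: FileStatus


def count_missing_by_node(files):
    """Count files whose status is not Done, grouped by data node."""
    return Counter(f.data_node for f in files if f.status != FileStatus.Done)


files = [
    FakeFile("vesg.ipsl.upmc.fr", FileStatus.Queued),
    FakeFile("vesg.ipsl.upmc.fr", FileStatus.Queued),
    FakeFile("other.example.org", FileStatus.Done),
]
print(count_missing_by_node(files))
```

With a real install, passing `query.files` to `count_missing_by_node` (after importing the real `FileStatus` instead of the stand-in) should give the same kind of summary, assuming the file objects behave as in the snippet above.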
To delete the files that did not download, the snippet is very similar to the first one:
from esgpull import Esgpull
from esgpull.models import FileStatus

esg = Esgpull(path="path/to/install")
query = esg.graph.get("abc123")
# Select every file that is not fully downloaded
missing_files = [f for f in query.files if f.status != FileStatus.Done]
# Remove those entries from the database
esg.db.delete(*missing_files)
Now, you can create a new query this way. Updating it will now pick up another data_node for missing files: