motherlode logs a successful dataloader run in a node with the label `LoadingLog`. Based on the `dockerhub_image_name` and `dockerhub_image_hash` properties of this node, motherlode decides whether it needs to import this dataloader again. So when motherlode runs and a dataloader in a certain version has already run in the past, and the dataloader has not changed, motherlode skips this dataloader.
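To make the skip rule concrete, here is a minimal sketch of the comparison, not motherlode's actual code. It assumes the `LoadingLog` nodes have already been read from the graph into plain dicts; the function name `already_loaded` and the image name `covidgraph/data-example` are hypothetical.

```python
from typing import Iterable, Mapping


def already_loaded(
    loading_logs: Iterable[Mapping[str, str]],
    image_name: str,
    image_hash: str,
) -> bool:
    """Return True if some LoadingLog entry matches both image name and hash."""
    return any(
        log.get("dockerhub_image_name") == image_name
        and log.get("dockerhub_image_hash") == image_hash
        for log in loading_logs
    )


# Hypothetical LoadingLog entries, as they might look after being read from the graph.
logs = [
    {
        "dockerhub_image_name": "covidgraph/data-example",
        "dockerhub_image_hash": "sha256:aaa",
    }
]

# Same name and hash: this dataloader would be skipped.
print(already_loaded(logs, "covidgraph/data-example", "sha256:aaa"))
# Same name, different hash: the dataloader changed and would run again.
print(already_loaded(logs, "covidgraph/data-example", "sha256:bbb"))
```

The rule only works across machines if the stored hash is the same everywhere, which is why the source of the hash matters.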
`LoadingLog` creation happens here: https://github.com/covidgraph/motherlode/blob/0bc0686f9de821113b88435c3717677833a355e8/motherlode/main.py#L37
At the moment motherlode saves the local hash ID of a Docker image, which differs on every local Docker installation (at least it seems so to me). It would be better to save the hash ID of the Docker Hub image. Then we would have a global consensus about which dataloaders have already run.
It looks like we can obtain the Docker image ID via docker-py's `RegistryData` objects: https://docker-py.readthedocs.io/en/stable/images.html#registrydata-objects
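A rough sketch of how that lookup could work, using docker-py's `client.images.get_registry_data()`, which returns a `RegistryData` object with an `id` attribute. The helper names `registry_image_hash` and `default_client` are hypothetical, and whether this ID is the right value to store in `dockerhub_image_hash` would still need to be verified.

```python
def registry_image_hash(client, image_name: str) -> str:
    """Return the image ID reported by the registry, not by the local image store."""
    registry_data = client.images.get_registry_data(image_name)
    return registry_data.id


def default_client():
    """Build a docker-py client from the environment (assumes docker-py is installed)."""
    # Imported lazily so registry_image_hash can be used with any compatible client.
    import docker

    return docker.from_env()
```

Calling `registry_image_hash(default_client(), "<image name>")` needs a reachable Docker daemon and access to the registry. Since the value comes from the registry, it should be the same no matter which machine asks.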