Open mkolar opened 9 years ago
You could use the filecmp
package to compare whether files that are already there actually the same, but I think if the access time differs that would even state the files would be different when using shallow
argument. And otherwise it'll check whole files which might be slow.
I've used the snippet below before for similar purposes being "reliable" in a way that it made sense under those circumstances:
import os
def filesame(src, dest, tolerance=0.1):
"""Return whether src and dest are likely the same file.
This is making assumptions to be fast as opposed to
being completely accurate.
Files are the different when:
- Either of the two doesn't exist
- There's a difference in modified timestamp > tolerance
- They differ in file size
Arguments:
tolerance (float): tolerance for modified time
difference in seconds
Returns:
bool: Whether two files are considered to be the same
"""
if not os.path.exists(dest) or not os.path.exists(src):
return False
# absolute modified time difference in seconds
timediff = abs(os.stat(src).st_mtime - os.stat(new).st_mtime)
if timediff > tolerance:
return False
# size difference
if os.stat(oldPathFull).st_size != os.stat(newPath).st_size):
return False
return True
When using this ensure you copy the files with their stats so when copied they preserve their modified timestamp from the source file. I think shutil.copyfiile
does that.
See if output exists, extract it if not, publish already existing files if yes.
Useful for caches, renders, elements, pre-renders, models etc.