CharlesJB / ENCODExplorer


downloadEncode(force=FALSE) has non-intuitive behaviour. #47

Open ericfournier2 opened 5 years ago

ericfournier2 commented 5 years ago

Calling downloadEncode with force=FALSE does not check that an existing file has a matching md5sum, and still reports success.

Example:

    q_results = queryEncodeGeneric(biosample_name="A549", 
                                   file_type="bed narrowPeak", 
                                   target="BHLHE40")
    d_results = downloadEncode(q_results)
    d_result_files = gsub("Success downloading file : ", "", d_results)

    checkTrue(all(file.exists(d_result_files)))

    # Download again with force=FALSE; this should fail, since the files are now empty.
    file.remove(d_result_files)
    file.create(d_result_files)
    d_results = downloadEncode(q_results, force=FALSE)

will yield:

    [1] "Success downloading file : ./ENCFF001VDM.bed.gz"
    [1] "Success downloading file : ./ENCFF002COC.bed.gz"
    [1] "Files can be found at C:/Dev/Projects/ENCODExplorer"

whereas the files have clearly not been downloaded, and are in fact corrupt.

The behaviour of force might not need to change, but its reporting should at least clearly spell out what happened (e.g. "Files were not downloaded because they already exist"). Checking whether the existing file's md5 matches the expected one would be a nice bonus.
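
A rough sketch of what such a check could look like, using `tools::md5sum` from base R. The helper name `check_existing_file` and its return values are hypothetical and not part of ENCODExplorer; it also assumes the expected md5 for each file is available to the caller (for instance from the query results), which is not verified here.

```r
# Hypothetical helper, NOT part of ENCODExplorer: classify an on-disk file
# as "missing", "valid" or "corrupt" by comparing its md5 to the expected one.
check_existing_file <- function(path, expected_md5) {
  if (!file.exists(path)) {
    return("missing")
  }
  # tools::md5sum() returns a named character vector; drop the name.
  actual_md5 <- unname(tools::md5sum(path))
  if (identical(actual_md5, tolower(expected_md5))) "valid" else "corrupt"
}

# Example: an empty file, like the ones created by file.create() above.
empty_file <- tempfile(fileext = ".bed.gz")
file.create(empty_file)

# A placeholder checksum stands in for the real expected md5 (assumption).
status <- check_existing_file(empty_file, "00000000000000000000000000000000")
message(switch(status,
  missing = "File not found; it would need to be downloaded.",
  valid   = "File was not downloaded because it already exists (md5 matches).",
  corrupt = "File already exists but its md5 does not match the expected one."
))
```

With something along these lines, force=FALSE could keep skipping existing files while reporting a distinct message for the skipped and the mismatching cases, rather than "Success downloading file" for both.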