camillegontier / DELAUNAY_dataset

Dataset zip folder is empty #2

Closed shruum closed 2 years ago

shruum commented 2 years ago

Hi,

I wanted to download the dataset but the zipped folder from the link seems empty.

camillegontier commented 2 years ago

Hi, thanks for reaching out to us. We might have an issue with the hosting of the images. Can you try to download them from the following link? https://physiologie.unibe.ch/preprints.aspx

camillegontier commented 2 years ago

There does indeed appear to be an intermittent issue with the .zip archive: the download may succeed, but the archive then cannot be opened (an error message appears). The issue is not always reproducible, but we are investigating it. Please feel free to share the exact error message you obtained. Also, while we are sorting this out, feel free to reach out to us at camille [dot] gontier [at] unibe [dot] ch so that we can arrange a direct transfer link. Thanks!

shruum commented 2 years ago

Hi,

I still can't download it. I am attaching the error message I get. The log was too big, so I just attached a sample. I wanted to use the dataset in the paper I am writing, so it would be helpful if I could get access to it.

Thanks for your help. Shruthi

7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21 p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,12 CPUs Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (906EA),ASM,AES-NI)

Scanning the drive for archives: 1 file, 4437877020 bytes (4233 MiB)

Listing archive: /data/input/datasets/Delaunay/delaunay.zip

-- Path = /data/input/datasets/Delaunay/delaunay.zip Type = zip ERRORS: Headers Error Physical Size = 4437877020 64-bit = +


Path = images/DELAUNAY Folder = + Size = 0 Packed Size = 0 Modified = 2022-02-01 14:31:20 Created = Accessed = Attributes = D Encrypted = - Comment = CRC = Method = Store Host OS = FAT Version = 20 Volume Index = 0

Path = images/DELAUNAY/Ad Reinhardt Folder = + Size = 0 Packed Size = 0 Modified = 2022-02-01 14:28:22 Created = Accessed = Attributes = D Encrypted = - Comment = CRC = Method = Store Host OS = FAT Version = 20 Volume Index = 0

Path = images/DELAUNAY/Ad Reinhardt/0.jpg Folder = - Size = 39372 Packed Size = 27727 Modified = 2022-01-21 18:42:12 Created = Accessed = Attributes = Encrypted = - Comment = CRC = 0B358748 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/DELAUNAY/Ad Reinhardt/1.jpg Folder = - Size = 73457 Packed Size = 69933 Modified = 2022-01-21 18:42:12 Created = Accessed = Attributes = Encrypted = - Comment = CRC = E1866D0C Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/DELAUNAY/Ad Reinhardt/10.jpg Folder = - Size = 41821 Packed Size = 28734 Modified = 2022-01-21 18:42:12 Created = Accessed = Attributes = Encrypted = - Comment = CRC = 73736F61 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/DELAUNAY/Ad Reinhardt/100.jpg Folder = - Size = 192258 Packed Size = 182209 Modified = 2022-01-21 18:42:12 Created = Accessed = Attributes = Encrypted = - Comment = CRC = 38235963 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/DELAUNAY/Ad Reinhardt/101.jpg Folder = - Size = 120908 Packed Size = 109810 Modified = 2022-01-21 18:42:12 Created = Accessed = Attributes = Encrypted = - Comment = CRC = 38095215 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/DELAUNAY/Ad Reinhardt/102.jpg Folder = - Size = 325601 Packed Size = 313440 Modified = 2022-01-21 18:42:14 Created = Accessed = Attributes = Encrypted = - Comment = CRC = 05E26117 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/DELAUNAY/Ad Reinhardt/103.jpg Folder = - Size = 130986 Packed Size = 120952 Modified = 2022-01-21 18:42:14 Created = Accessed = Attributes = Encrypted = - Comment = CRC = 10B92B9C Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/DELAUNAY/Ad Reinhardt/104.jpg Folder = - Size = 231354 Packed Size = 221112 Modified = 2022-01-21 18:42:14 Created = Accessed = Attributes = Encrypted = - Comment = CRC = A67E0A95 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/DELAUNAY/Ad Reinhardt/105.jpg Folder = - Size = 14636 Packed Size = 12631 Modified = 2022-01-21 18:42:14 Created = Accessed = Attributes = Encrypted = - Comment = CRC = 3401A6C2 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/DELAUNAY/Ad Reinhardt/106.jpg Folder = - Size = 175620 Packed Size = 170329 Modified = 2022-01-21 18:42:14 Created = Accessed = Attributes = Encrypted = - Comment = CRC = 8A125333 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/DELAUNAY/Ad Reinhardt/107.jpg Folder = - Size = 36098 Packed Size = 31501 Modified = 2022-01-21 18:42:14 Created = Accessed = Attributes = Encrypted = - Comment = CRC = F2B53076 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/DELAUNAY/Ad Reinhardt/108.jpg Folder = - Size = 151701 Packed Size = 114543 Modified = 2022-01-21 18:42:14 Created = Accessed = Attributes = Encrypted = - Comment = CRC = 086A6B1A Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/DELAUNAY/Ad Reinhardt/109.jpg Folder = - Size = 41256 Packed Size = 39104 Modified = 2022-01-21 18:42:14 Created = Accessed = Attributes = Encrypted = - Comment = CRC = CDD4FBF2 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/DELAUNAY/Ad Reinhardt/11.jpg Folder = - Size = 7028 Packed Size = 5198 Modified = 2022-01-21 18:42:12 Created = Accessed = Attributes = Encrypted = - Comment = CRC = DB306451 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/DELAUNAY/Ad Reinhardt/110.jpg Folder = - Size = 3734072 Packed Size = 3593414 Modified = 2022-01-21 18:42:14 Created = Accessed = Attributes = Encrypted = - Comment = CRC = 698B9FC6 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/DELAUNAY/Ad Reinhardt/111.jpg Folder = - Size = 332090 Packed Size = 319999 Modified = 2022-01-21 18:42:14 Created = Accessed = Attributes = Encrypted = - Comment = CRC = DDA948B6 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/DELAUNAY/Ad Reinhardt/112.jpg Folder = - Size = 98881 Packed Size = 79224 Modified = 2022-01-21 18:42:14 Created = Accessed = Attributes = Encrypted = - Comment = CRC = EE9F048A Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/DELAUNAY/Ad Reinhardt/113.jpg Folder = - Size = 49585 Packed Size = 45383 Modified = 2022-01-21 18:42:14 Created = Accessed = Attributes = Encrypted = - Comment = CRC = 7AF918F0 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/DELAUNAY/Ad Reinhardt/114.jpg Folder = - Size = 20225 Packed Size = 2749 Modified = 2022-01-21 18:42:14 Created = Accessed = Attributes = Encrypted = - Comment = CRC = FEE75A40 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/DELAUNAY/Ad Reinhardt/115.jpg Folder = - Size = 107677 Packed Size = 101782 Modified = 2022-01-21 18:42:14 Created = Accessed = Attributes = Encrypted = - Comment = CRC = 0AFB365A Method = Deflate Host OS = FAT Version = 20

. . .

Modified = 2022-01-21 18:44:26 Created = Accessed = Attributes = Encrypted = - Comment = CRC = E1E87A35 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/URLs/Naum Gabo.csv Folder = - Size = 30397 Packed Size = 10998 Modified = 2022-01-21 18:44:30 Created = Accessed = Attributes = Encrypted = - Comment = CRC = 1FA734E7 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/URLs/Olle Bærtling.csv Folder = - Size = 16363 Packed Size = 6886 Modified = 2022-01-21 18:44:34 Created = Accessed = Attributes = Encrypted = - Comment = CRC = 4A65D4B7 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/URLs/Otto Freundlich.csv Folder = - Size = 19737 Packed Size = 7413 Modified = 2022-01-21 18:44:38 Created = Accessed = Attributes = Encrypted = - Comment = CRC = B1BF1C9D Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/URLs/Pierre Soulages.csv Folder = - Size = 18681 Packed Size = 7261 Modified = 2022-01-21 18:44:42 Created = Accessed = Attributes = Encrypted = - Comment = CRC = 70816C82 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/URLs/Pierre Tal Coat.csv Folder = - Size = 21612 Packed Size = 8726 Modified = 2022-01-21 18:44:44 Created = Accessed = Attributes = Encrypted = - Comment = CRC = ADAD5816 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/URLs/Piet Mondrian.csv Folder = - Size = 21180 Packed Size = 8793 Modified = 2022-01-21 18:44:48 Created = Accessed = Attributes = Encrypted = - Comment = CRC = 1800F2FE Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/URLs/Richard Paul Lohse.csv Folder = - Size = 21631 Packed Size = 8093 Modified = 2022-01-21 18:44:52 Created = Accessed = Attributes = Encrypted = - Comment = CRC = B778BBE4 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/URLs/Roger Bissière.csv Folder = - Size = 20793 Packed Size = 8817 Modified = 2022-01-21 18:44:56 Created = Accessed = Attributes = Encrypted = - Comment = CRC = 26673872 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/URLs/Sam Francis.csv Folder = - Size = 30264 Packed Size = 10286 Modified = 2022-01-21 18:45:02 Created = Accessed = Attributes = Encrypted = - Comment = CRC = E8CAB378 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/URLs/Sonia Robert Delaunay.csv Folder = - Size = 18722 Packed Size = 8149 Modified = 2022-01-21 18:45:06 Created = Accessed = Attributes = Encrypted = - Comment = CRC = A4127B2E Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/URLs/Sophie Taeuber-Arp.csv Folder = - Size = 20776 Packed Size = 8055 Modified = 2022-01-21 18:45:10 Created = Accessed = Attributes = Encrypted = - Comment = CRC = C5F8D566 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/URLs/Theo van Doesburg.csv Folder = - Size = 24380 Packed Size = 9577 Modified = 2022-01-21 18:45:12 Created = Accessed = Attributes = Encrypted = - Comment = CRC = AB6CA4C2 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/URLs/Vassily Kandinsky.csv Folder = - Size = 43500 Packed Size = 12390 Modified = 2022-01-21 18:45:20 Created = Accessed = Attributes = Encrypted = - Comment = CRC = 1A725864 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/URLs/Victor Vasarely.csv Folder = - Size = 34415 Packed Size = 10925 Modified = 2022-01-21 18:45:26 Created = Accessed = Attributes = Encrypted = - Comment = CRC = 0F361420 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Path = images/URLs/Yves Klein.csv Folder = - Size = 12776 Packed Size = 5794 Modified = 2022-01-21 18:45:30 Created = Accessed = Attributes = Encrypted = - Comment = CRC = F788BDC5 Method = Deflate Host OS = FAT Version = 20 Volume Index = 0

Errors: 1 Volume Index = 0
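
The log above comes from 7-Zip. A comparable integrity check can be sketched in Python with the standard `zipfile` module. This is only a minimal sketch and was not posted in the thread; the helper name `check_zip` and the example path are hypothetical.

```python
import zipfile
from pathlib import Path


def check_zip(path: Path) -> list[str]:
    """Return a list of human-readable problems found in the archive (empty if none)."""
    problems: list[str] = []
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() reads every member and returns the name of the first
            # one whose CRC or header is bad, or None if all members are fine.
            bad_member = archive.testzip()
            if bad_member is not None:
                problems.append(f"corrupt member: {bad_member}")
            if not archive.namelist():
                problems.append("archive contains no entries")
    except zipfile.BadZipFile as exc:
        # Raised when the archive structure itself cannot be parsed.
        problems.append(f"not a readable zip file: {exc}")
    return problems


# Usage (hypothetical local path):
# print(check_zip(Path("delaunay.zip")))
```

An empty list means the archive opened and every member passed its CRC check; any other result gives a concrete message to report.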

camillegontier commented 2 years ago

Thanks for the details. From the error log, I understand you are working on Linux or Mac, right? Can you try the following download links?

The full data set: https://physiologie.unibe.ch/supplementals/delaunay_1.zip

The test set: https://physiologie.unibe.ch/supplementals/delaunay_test.zip

The train set: https://physiologie.unibe.ch/supplementals/delaunay_train.zip

The URLs: https://physiologie.unibe.ch/supplementals/URLs.zip
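
For anyone who prefers to fetch these four archives from a script, here is a minimal sketch using only the Python standard library. It saves each archive to disk without extracting it. The helper name `download` and the destination folder are assumptions for illustration, and the sketch has not been run against the server.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse

# Links quoted from the comment above.
DATASET_URLS = [
    "https://physiologie.unibe.ch/supplementals/delaunay_1.zip",
    "https://physiologie.unibe.ch/supplementals/delaunay_test.zip",
    "https://physiologie.unibe.ch/supplementals/delaunay_train.zip",
    "https://physiologie.unibe.ch/supplementals/URLs.zip",
]


def download(url: str, dest_dir: Path) -> Path:
    """Stream `url` into a file inside `dest_dir` and return the local path."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Name the local file after the last component of the URL path.
    target = dest_dir / Path(urlparse(url).path).name
    with urllib.request.urlopen(url) as response, open(target, "wb") as out:
        # copyfileobj copies in chunks, so a multi-gigabyte archive
        # never has to fit in memory at once.
        shutil.copyfileobj(response, out)
    return target


# Usage (hypothetical destination folder):
# for url in DATASET_URLS:
#     print("saved", download(url, Path("data/delaunay")))
```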

camillegontier commented 2 years ago

Very large .zip files are apparently often not handled well when they are created on one OS and opened on another. Instead of a single .zip archive, we now provide 4 download links containing, respectively, the full dataset, the train set, the test set, and the URLs. The download links have been corrected on the main page of the GitHub project.
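
One plausible contributing factor, not confirmed anywhere in this thread, is archive size. The classic ZIP format stores sizes and offsets in 32-bit fields, so an archive larger than about 4 GiB needs the ZIP64 extension, which older or stricter tools may not handle. The sketch below just does that arithmetic for the size reported in the 7-Zip log; `needs_zip64` is a hypothetical helper name.

```python
# Largest value that fits in the 32-bit size/offset fields of the classic ZIP format.
ZIP32_LIMIT = 0xFFFFFFFF  # 4294967295 bytes, just under 4 GiB


def needs_zip64(archive_size_bytes: int) -> bool:
    """Return True if an archive of this size cannot fit the 32-bit ZIP fields."""
    return archive_size_bytes > ZIP32_LIMIT


# 4437877020 bytes is the archive size reported in the 7-Zip log above.
print(needs_zip64(4437877020))  # True
```

This agrees with the `64-bit = +` flag in the log, although it does not by itself explain the `Headers Error`.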

shruum commented 2 years ago

Thank you for your help. I was able to download the datasets.

Regards Shruthi

brando90 commented 2 years ago

@shruum @camillegontier is there a Python script to download the datasets? Thanks!

brando90 commented 2 years ago

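Some initial code:

import shutil
import ssl
import tempfile
import urllib.request
import zipfile
from pathlib import Path

def download_and_unzip(url: str, extract_to: Path = Path('~/data/tmp/')) -> None:
    # Expand "~" so the archive is extracted under the user's home directory.
    extract_to = Path(extract_to).expanduser()
    # NOTE: certificate verification is disabled here, which is insecure.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE

    print("downloading dataset from ", url)
    # zipfile needs a seekable file, so spool the download to a temporary file first.
    with urllib.request.urlopen(url, context=ctx) as response, tempfile.TemporaryFile() as tmp:
        shutil.copyfileobj(response, tmp)
        print("extracting to ", extract_to)
        # The dataset links point to .zip archives, so use zipfile rather than tarfile.
        with zipfile.ZipFile(tmp) as archive:
            archive.extractall(path=extract_to)

camillegontier commented 2 years ago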

Good point, so far there is no download script. We can work on that. Thanks!

brando90 commented 2 years ago

I can do it. Will post in a few hours.

brando90 commented 2 years ago

This should work: https://github.com/brando90/ultimate-utils/blob/4135aa825762716e7c13583437297190dfa678b6/ultimate-utils-proj-src/uutils/__init__.py#L1672