razorx89 / roco-dataset

Radiology Objects in COntext (ROCO): A Multimodal Image Dataset

Windows Error when running fetch.py #6

Closed: fjpa121197 closed this issue 2 years ago

fjpa121197 commented 2 years ago

Hello, can someone help me with the following issue? I am running the script fetch.py on Windows 10 with Python 3.8.11.

I get the following output and error message:

Configuration:
Subdirectory: images
Extraction directory: C:\Users\franc\AppData\Local\Temp\roco-dataset
Keep archives: False
Delete contents of extraction directory: True
Number of processes: 4
Number of download retries: 10
Fetching ROCO dataset images...
multiprocessing.pool.RemoteTraceback:

Traceback (most recent call last):
  File "C:\Users\franc\anaconda3\lib\multiprocessing\pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "C:\Users\franc\roco-dataset\scripts\fetch.py", line 131, in process_group
    result = download_archive(extraction_dir_name, archive_url,
  File "C:\Users\franc\roco-dataset\scripts\fetch.py", line 209, in download_archive
    return subprocess.call(['wget', '-nc', '-nd', '-c', '-q', '-P',
  File "C:\Users\franc\anaconda3\lib\subprocess.py", line 340, in call
    with Popen(*popenargs, **kwargs) as p:
  File "C:\Users\franc\anaconda3\lib\subprocess.py", line 854, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Users\franc\anaconda3\lib\subprocess.py", line 1307, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "scripts\fetch.py", line 330, in <module>
    for i, pmc_id in enumerate(pool.imap_unordered(process_group,
  File "C:\Users\franc\anaconda3\lib\multiprocessing\pool.py", line 868, in next
    raise value
FileNotFoundError: [WinError 2] The system cannot find the file specified

Not sure what I'm doing wrong. Any ideas on how to solve the issue?

Thanks in advance

fjpa121197 commented 2 years ago

Also, is there a correct way of citing the paper if I want to display a sample of the dataset on a website?

I want to display some sample images on a website that lets a user get concept predictions from radiology images. I don't want users to have to search for images on Google, save them, and then upload them, so I want to provide a few sample images for quickly testing the concept prediction part.

saviola777 commented 2 years ago

I will try to reproduce the issue on my Windows machine tomorrow.

I'm not sure about the citation. The simplest thing would be to use the citation given in the README, along with a link to this repository. Does that not work for you?

fjpa121197 commented 2 years ago

Great, just wanted to make sure to cite correctly.

I did manage to get the script fetch.py to work, but I had to install the Ubuntu terminal for Windows and run the script from there. The images now seem to be downloading correctly.

saviola777 commented 2 years ago

Your issue was caused by wget not being installed or not being on the PATH. I've now added more meaningful error output for this case. I ran into another issue when testing on Windows, which should also be fixed now.
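
For anyone hitting the same WinError 2, here is a minimal sketch of how such a check could look. It is not necessarily what the fix in fetch.py does. The helper names `find_tool` and `download` are hypothetical, and the trailing `target_dir` and `url` arguments are assumptions, since the traceback only shows the start of the wget call.

```python
import shutil
import subprocess


def find_tool(name="wget"):
    """Return the full path to `name` if it is on PATH, else None."""
    return shutil.which(name)


def download(url, target_dir, tool="wget"):
    """Hypothetical helper: run wget only after confirming it can be found."""
    tool_path = find_tool(tool)
    if tool_path is None:
        # Fail with a readable message instead of the bare WinError 2.
        raise RuntimeError(
            f"'{tool}' was not found on PATH. Install it or add its "
            "directory to PATH before running fetch.py."
        )
    # Flags copied from the traceback; target_dir and url are assumed arguments.
    return subprocess.call(
        [tool_path, "-nc", "-nd", "-c", "-q", "-P", target_dir, url]
    )
```

Running `python -c "import shutil; print(shutil.which('wget'))"` in the same terminal that runs fetch.py is a quick way to test this: if it prints `None`, wget is not visible to Python in that environment.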