microsoft / LLaVA-Med

Large Language-and-Vision Assistant for Biomedicine, built towards multimodal GPT-4 level capabilities.

Released files: 404 error #13

Closed: SinanAkkoyun closed this issue 7 months ago

SinanAkkoyun commented 7 months ago

Thank you so so much for releasing all the data and models!

I get the following error:

Downloading PMC articles
  0%|                                                                                            | 916/721154 [33:49<443:15:00,  2.22s/it]
Traceback (most recent call last):
  File "llava/data/download_images.py", line 43, in <module>
    main(args)
  File "llava/data/download_images.py", line 19, in main
    urllib.request.urlretrieve(sample['pmc_tar_url'], os.path.join(args.pmc_output_path, os.path.basename(sample['pmc_tar_url'])))
  File "/usr/lib/python3.8/urllib/request.py", line 247, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/usr/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/lib/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.8/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
SinanAkkoyun commented 7 months ago

And also, this is the LlaVa Med 2 release, right?

ChunyuanLI commented 7 months ago

> And also, this is the LlaVa Med 2 release, right?

No, this is the original LLaVA-Med release, after going through five months of intensive discussions to comply with Microsoft's release policy.

SinanAkkoyun commented 7 months ago

> No, this is the original LLaVA-Med release, after going through five months of intensive discussions to comply with Microsoft's release policy.

@ChunyuanLI I see, thank you so so much for going through the trouble!

Kingofolk commented 3 months ago

Hi, have you solved this problem? I ran into a similar one. When using download_images.py to download PMC articles, the download stopped unexpectedly at 916/721145. To be clearer, please see the following screenshot. [image] Looking forward to your reply. Thanks a lot.
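
For anyone hitting the same traceback: the thread does not record a fix, but the trace shows that a single `urllib.request.urlretrieve` call raising `HTTPError` aborts the whole loop. Below is a minimal sketch of one possible workaround, not the repository's own code. It assumes each sample is a dict with a `pmc_tar_url` key, as in the traceback; the function name `download_all`, the `retrieve` parameter, and the `failed` list are hypothetical names introduced only for illustration.

```python
import os
import urllib.error
import urllib.request


def download_all(samples, output_dir, retrieve=urllib.request.urlretrieve):
    """Download each sample's tarball, skipping URLs that return an HTTP error.

    Returns a list of (url, http_status) pairs for the downloads that failed.
    """
    os.makedirs(output_dir, exist_ok=True)
    failed = []
    for sample in samples:
        url = sample["pmc_tar_url"]
        target = os.path.join(output_dir, os.path.basename(url))
        if os.path.exists(target):
            continue  # already present from an earlier run
        try:
            retrieve(url, target)
        except urllib.error.HTTPError as err:
            # e.g. 404 when the archive was moved or removed upstream
            failed.append((url, err.code))
    return failed
```

With this shape, a missing archive is recorded and the loop moves on to the next sample instead of stopping at item 916. Note that the "already present" check is a simplification: a file left half-written by an interrupted run would also be skipped, so a real script would likely want to verify the file before trusting it.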