sirmammingtonham / vector-borne-disease-analytics

Dataset and Code for 2021 IEEE International Conference on Big Data Paper - Scraping Unstructured Data to Explore the Relationship between Rainfall Anomalies and Vector-Borne Disease Outbreaks
https://ieeexplore.ieee.org/abstract/document/9671853

scrape_promed.py Malaria ValueError #1

Open UnitedBagels opened 2 years ago

UnitedBagels commented 2 years ago

The scraper fails with a ValueError when I run python scrape_promed.py malaria

I also tested Dengue and Zika, and both of those work. Output and traceback from the malaria run:

Fetching results page 0
Fetching results page 1
...
Fetching results page 14
Finished parsing post #1
Finished parsing post #2
...
Finished parsing post #210
Finished parsing post #211
Traceback (most recent call last):
  File "scrape_promed.py", line 80, in <module>
    get_posts(search_term)
  File "scrape_promed.py", line 75, in get_posts
    for _ in executor.map(get_post, post_ids.items()):
  File "C:\Users\Thomas\AppData\Local\Programs\Python\Python37-32\lib\concurrent\futures\_base.py", line 586, in result_iterator
    yield fs.pop().result()
  File "C:\Users\Thomas\AppData\Local\Programs\Python\Python37-32\lib\concurrent\futures\_base.py", line 425, in result
    return self.__get_result()
  File "C:\Users\Thomas\AppData\Local\Programs\Python\Python37-32\lib\concurrent\futures\_base.py", line 384, in __get_result
    raise self._exception
  File "C:\Users\Thomas\AppData\Local\Programs\Python\Python37-32\lib\concurrent\futures\thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "scrape_promed.py", line 69, in get_post
    [r['postinfo'][x] for x in r['postinfo'].keys() if x in COLUMNS]]
  File "C:\Users\Thomas\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\indexing.py", line 670, in __setitem__
    iloc._setitem_with_indexer(indexer, value)
  File "C:\Users\Thomas\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\indexing.py", line 1626, in _setitem_with_indexer
    self._setitem_with_indexer_missing(indexer, value)
  File "C:\Users\Thomas\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\indexing.py", line 1860, in _setitem_with_indexer_missing
    self.obj._mgr = self.obj.append(value)._mgr
  File "C:\Users\Thomas\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\frame.py", line 7751, in append
    sort=sort,
  File "C:\Users\Thomas\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\reshape\concat.py", line 287, in concat
    return op.get_result()
  File "C:\Users\Thomas\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\reshape\concat.py", line 503, in get_result
    mgrs_indexers, self.new_axes, concat_axis=self.bm_axis, copy=self.copy,
  File "C:\Users\Thomas\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\internals\concat.py", line 84, in concatenate_block_managers
    return BlockManager(blocks, axes)
  File "C:\Users\Thomas\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\internals\managers.py", line 149, in __init__
    self._verify_integrity()
  File "C:\Users\Thomas\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\internals\managers.py", line 329, in _verify_integrity
    raise construction_error(tot_items, block.shape[1:], self.axes)
ValueError: Shape of passed values is (202, 21), indices imply (201, 21)

sirmammingtonham commented 2 years ago

Hmm, I just ran the same command and it works for me. It might be a multithreading issue, so try setting max_workers=1 on line 74 and see if that helps.
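
If max_workers=1 makes the error go away, a longer-term option is to stop writing to a shared DataFrame from the worker threads. The traceback suggests get_post assigns a row via .loc inside the thread pool, and pandas does not guarantee that concurrent row appends are thread-safe, which would explain the off-by-one shape mismatch. Below is a minimal sketch of that idea, not the script's actual code: fetch_post, get_row, the three column names, and the sample data are hypothetical stand-ins, and the real COLUMNS list and request logic live in scrape_promed.py.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical column names; the real COLUMNS list is defined in scrape_promed.py.
COLUMNS = ["id", "title", "date"]


def fetch_post(item):
    """Hypothetical stand-in for the network request; returns a postinfo-like dict."""
    post_id, title = item
    return {"id": post_id, "title": title, "date": "2021-01-01", "extra": "ignored"}


def get_row(item):
    """Build one row as a plain list. No shared state is modified in the worker."""
    postinfo = fetch_post(item)
    return [postinfo.get(col) for col in COLUMNS]


def get_posts(post_ids, max_workers=8):
    # Workers only return rows; executor.map yields results in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(get_row, post_ids.items()))


rows = get_posts({i: f"post {i}" for i in range(500)})
print(len(rows), rows[0])
```

The resulting list can then be turned into a table once, in the main thread, with pd.DataFrame(rows, columns=COLUMNS), so no two threads ever resize the same DataFrame at the same time.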