Bunsly / JobSpy

Jobs scraper library for LinkedIn, Indeed, Glassdoor & ZipRecruiter
https://usejobspy.com
MIT License
767 stars, 142 forks

Error in packages #76

Closed solidagold closed 9 months ago

solidagold commented 9 months ago

Image attached. I've been trying to download this package to no avail. The package is installed on my computer, but for some reason this code doesn't run. Can I ask for help with this? Thank you.

DanielOX commented 9 months ago

@solidagold How many Python versions are installed on your computer? Try running the `whereis python` command. You also need to verify which Python interpreter PyCharm is using; it might be one where this package was not installed.
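One way to check this from inside PyCharm is to have the script itself report which interpreter is running it and whether the module can be found. This is a minimal sketch, assuming the import name is `jobspy` (as in the import statement shown in this thread); `describe_environment` is a hypothetical helper name, not part of the library.

```python
import importlib.util
import sys


def describe_environment(module_name: str = "jobspy") -> dict:
    """Report which interpreter is running and whether a module is importable."""
    # find_spec returns None when a top-level module is not installed
    # for the interpreter that is executing this script.
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        "module_found": spec is not None,
        "module_location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `module_found` is false, the package was most likely installed for a different interpreter than the one printed under `executable`.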

solidagold commented 9 months ago

I only had 3.8 and 3.10 installed. I removed 3.8, so the only Python version left is 3.10. I also checked the Python interpreter setting, and it is running on 3.10.

This is the error I got when I tried running the code:

    Traceback (most recent call last):
      File "C:\Users\user\PycharmProjects\jobsearch\jobfile.py", line 1, in <module>
        from jobspy import scrape_jobs
    ModuleNotFoundError: No module named 'jobspy'

DanielOX commented 9 months ago

Can you try installing the package with the pip command? Open a new terminal and type `python3.10 -m pip install python-jobspy`. After that, close PyCharm and reopen it. Let me know if that works for you.

solidagold commented 9 months ago

Is that a command that works for windows? It says it's not a recognized internal or external command.

DanielOX commented 9 months ago

Yes, it is a Windows-compatible command. Can you try `python -m pip install python-jobspy`, this time without specifying the Python version?
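Running pip through `-m` ties the install to whichever interpreter the command resolves to, so the package ends up where that interpreter looks for it. The same idea can be sketched from Python by building the command from `sys.executable`. The helper names `pip_install_command` and `install` are hypothetical and used only for illustration.

```python
import subprocess
import sys


def pip_install_command(package: str) -> list[str]:
    """Build a pip command bound to the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", package]


def install(package: str) -> int:
    """Run the install and return pip's exit code (0 means success)."""
    return subprocess.run(pip_install_command(package), check=False).returncode


if __name__ == "__main__":
    # Only prints the command; call install("python-jobspy") to actually run it.
    print(" ".join(pip_install_command("python-jobspy")))
```

Running this from the PyCharm project shows the exact command that would install into the project's interpreter, which avoids guessing which `python` is first on the PATH.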

solidagold commented 9 months ago

It works now! Thanks so much. But I ran the code and there's no output. Does it take time for the code to run, or is this a potential error? Thanks.

Edit: I got this output.

    Traceback (most recent call last):
      File "C:\Users\user\PycharmProjects\jobsearch\venv\lib\site-packages\jobspy\__init__.py", line 86, in scrape_site
        scraped_data: JobResponse = scraper.scrape(scraper_input)
      File "C:\Users\user\PycharmProjects\jobsearch\venv\lib\site-packages\jobspy\scrapers\glassdoor\__init__.py", line 116, in scrape
        location_id, location_type = self.get_location(
      File "C:\Users\user\PycharmProjects\jobsearch\venv\lib\site-packages\jobspy\scrapers\glassdoor\__init__.py", line 177, in get_location
        raise GlassdoorException(
    jobspy.scrapers.exceptions.GlassdoorException: bad response status code: 403

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "C:\Users\user\PycharmProjects\jobsearch\venv\lib\site-packages\jobspy\__init__.py", line 114, in scrape_jobs
        site_value, scraped_data = future.result()
      File "C:\Program Files\Python310\lib\concurrent\futures\_base.py", line 438, in result
        return self.__get_result()
      File "C:\Program Files\Python310\lib\concurrent\futures\_base.py", line 390, in __get_result
        raise self._exception
      File "C:\Program Files\Python310\lib\concurrent\futures\thread.py", line 52, in run
        result = self.fn(*self.args, **self.kwargs)
      File "C:\Users\user\PycharmProjects\jobsearch\venv\lib\site-packages\jobspy\__init__.py", line 105, in worker
        site_val, scraped_info = scrape_site(site)
      File "C:\Users\user\PycharmProjects\jobsearch\venv\lib\site-packages\jobspy\__init__.py", line 97, in scrape_site
        raise GlassdoorException(str(e))
    jobspy.scrapers.exceptions.GlassdoorException: bad response status code: 403

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "C:\Users\user\PycharmProjects\jobsearch\jobfile.py", line 3, in <module>
        jobs = scrape_jobs(
      File "C:\Users\user\PycharmProjects\jobsearch\venv\lib\site-packages\jobspy\__init__.py", line 108, in scrape_jobs
        with ThreadPoolExecutor() as executor:
      File "C:\Program Files\Python310\lib\concurrent\futures\_base.py", line 636, in __exit__
        self.shutdown(wait=True)
      File "C:\Program Files\Python310\lib\concurrent\futures\thread.py", line 229, in shutdown
        t.join()
      File "C:\Program Files\Python310\lib\threading.py", line 1089, in join
        self._wait_for_tstate_lock()
      File "C:\Program Files\Python310\lib\threading.py", line 1105, in _wait_for_tstate_lock
        elif lock.acquire(block, timeout):
    KeyboardInterrupt

DanielOX commented 9 months ago

There seems to be an error with Glassdoor. Try removing glassdoor from the site_name array, like this:

site_name=["indeed", "linkedin", "zip_recruiter"]
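The traceback above shows that one failing site (the 403 from Glassdoor) aborts the whole `scrape_jobs` call. As an alternative to removing a site by hand, each site can be scraped separately so that a failure is recorded instead of stopping the run. This is only a sketch: `scrape_each_site` and `scrape_with_jobspy` are hypothetical helpers, and calling `scrape_jobs` with nothing but `site_name` is an assumption, since that is the only argument shown in this thread.

```python
from typing import Any, Callable, Iterable


def scrape_each_site(
    sites: Iterable[str],
    scrape_one: Callable[[str], Any],
) -> tuple[dict[str, Any], dict[str, str]]:
    """Call scrape_one(site) per site; collect results and errors separately."""
    results: dict[str, Any] = {}
    errors: dict[str, str] = {}
    for site in sites:
        try:
            results[site] = scrape_one(site)
        except Exception as exc:  # e.g. GlassdoorException on a 403 response
            errors[site] = f"{type(exc).__name__}: {exc}"
    return results, errors


def scrape_with_jobspy(site: str) -> Any:
    """Assumed usage: only scrape_jobs and site_name appear in this thread."""
    # Imported lazily so scrape_each_site can be used without jobspy installed.
    from jobspy import scrape_jobs

    return scrape_jobs(site_name=[site])
```

A call such as `results, errors = scrape_each_site(["indeed", "linkedin", "zip_recruiter", "glassdoor"], scrape_with_jobspy)` would then return whatever the working sites produced, plus a short message for each site that failed.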

DanielOX commented 9 months ago

@solidagold Let me know if it worked.

DanielOX commented 9 months ago

This issue can be closed. Please assign it to me if possible.

cc: @ZacharyHampton

solidagold commented 9 months ago

Oh sorry, just received the notification now. I'll try it ASAP.

solidagold commented 9 months ago

Works now, thanks so much!