Closed phhoang98a closed 8 months ago
It may help to see what the code is that you wrote.
Also, when you say:
> or even did not return anything
...is it returning a data frame with 0 items in, or is the data frame null/an error occurs?
```python
@app.route("/job", methods=['POST'])
def job():
    job_title = request.json['job_title']
    country = request.json['country']
    location = request.json['location']

    jobs: pd.DataFrame = scrape_jobs(
        site_name=["indeed"],
        search_term=job_title,
        location=location,
        results_wanted=10,
        country_indeed=country,
    )
    return {
        "job_url": jobs["job_url"].tolist(),
        "site": jobs["site"].tolist(),
        "title": jobs["title"].tolist(),
        "company": jobs["company"].tolist(),
        "location": jobs["location"].tolist(),
        "date_posted": jobs["date_posted"].tolist(),
    }, 200
```
This is my basic code. I'm sure the input parameters are correct because it works locally. You can try creating a Flask API and deploying it to reproduce the issue.
I'm using this on my own site usejobspy.com as a FastAPI app on DigitalOcean with no issues. Indeed has banned a lot of the cloud providers' IPs. You need to put a proxy on it.
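A minimal sketch of what that could look like, assuming a recent jobspy release where `scrape_jobs` accepts a `proxies` keyword (older releases took a single `proxy` string, so check your installed version). The proxy address and search values below are placeholders, not working credentials:

```python
# Same scrape_jobs call as in the snippet above, but routed through a
# proxy so requests don't originate from a blocked cloud-provider IP.
# The `proxies` keyword is an assumption based on recent jobspy
# releases; the address is a placeholder, not a real endpoint.
scrape_kwargs = {
    "site_name": ["indeed"],
    "search_term": "software engineer",           # example search term
    "location": "New York, NY",                   # example location
    "results_wanted": 10,
    "country_indeed": "USA",
    "proxies": ["user:pass@203.0.113.10:8080"],   # placeholder proxy
}
# jobs = scrape_jobs(**scrape_kwargs)  # call exactly as before, plus the proxy
```

Rotating residential proxies tend to work better than datacenter ones here, since Indeed's blocks target datacenter IP ranges.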
I want to create an API to crawl jobs. It worked well in the local environment, but it runs for a long time or even returns nothing when I deploy the API to clouds like Render or Azure. I also tried Flask+Celery locally, but the scrape_jobs function does not return anything.