raissonsouto / pijama2json

A script that scrapes data about the courses offered at UFCG from a PDF and converts it into JSON.

Pijama2json

Pijama2json is a Python scraper that extracts information from the PDFs known within the community as "pijamas". These PDFs contain details about the disciplines offered in the different courses at UFCG. The tool converts the extracted data into JSON so that it is easier to access and reuse.

Help improve it

Contributions to this project are welcome. If you find a bug or have an idea for a new feature, please open an issue or submit a pull request.

Run it on your own machine

To run this application on your local machine, you need Python and pip installed.

Once both are installed, follow these steps to install and run the application:

  1. Clone this repository to your local machine
  2. Navigate to the project's root directory
  3. Create a virtual environment by running the command python -m venv venv
  4. Activate the venv by running:
    • Windows (PowerShell): venv\Scripts\Activate.ps1
    • Linux or Mac: source venv/bin/activate
  5. Install the required packages by running the command pip install -r requirements.txt
  6. Start the application by running the command python app.py
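
The steps above can also be scripted. The following is only a minimal sketch, not part of the project: it uses the standard-library venv and subprocess modules to create the environment, install the requirements, and start app.py. The helper names venv_python and setup_and_run are hypothetical, and the sketch assumes that requirements.txt and app.py sit in the project's root directory, as the steps imply.

```python
import subprocess
import sys
import venv
from pathlib import Path


def venv_python(env_dir: Path, platform: str = sys.platform) -> Path:
    """Return the path of the interpreter inside a virtual environment.

    Hypothetical helper: Windows places it under Scripts, other
    platforms under bin.
    """
    if platform.startswith("win"):
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def setup_and_run(project_root: Path) -> None:
    """Automate steps 3 to 6 for a clone located at project_root (hypothetical helper)."""
    env_dir = project_root / "venv"

    # Step 3: create the virtual environment (like `python -m venv venv`).
    venv.create(env_dir, with_pip=True)

    # Step 4 is not needed here: calling the environment's own interpreter
    # directly has the same effect as activating it first.
    python = str(venv_python(env_dir))

    # Step 5: install the required packages.
    subprocess.run(
        [python, "-m", "pip", "install", "-r", "requirements.txt"],
        cwd=project_root,
        check=True,
    )

    # Step 6: run the application.
    subprocess.run([python, "app.py"], cwd=project_root, check=True)
```

To use it, call setup_and_run(Path.cwd()) from the project's root directory after cloning the repository. With check=True, a failing step raises an exception rather than letting the script continue.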