a-maccormack / pegas-chile

Technology Job Listings
https://pegas-chile.vercel.app

Pegas Chile

Tailwind CSS Figma React TypeScript npm Next.js Python

🔧 Project Overview

This project scrapes job postings from a Telegram channel and presents them through a frontend built with Next.js. Jobs are categorized "manually" (with BeautifulSoup) and with the assistance of large language models (LLMs).

Key Features

📸 Screenshots

Landing Internal

📂 Project Structure

Scraping Process

The scraping process targets the DCCEmpleo Telegram Channel, which regularly posts job listings for the Chilean tech scene. I scraped over 1,400 job posts, categorizing them with a combination of manual tagging and language model assistance (GPT-4 mini).
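As a rough illustration of the "manual" side of this categorization, here is a minimal sketch that pulls contact emails, links, and technology keywords out of a raw post with regular expressions. It is not the project's actual scraper: the function name `extract_fields`, the `KNOWN_TECHNOLOGIES` list, and both regexes are assumptions made for the example, and it works on plain text rather than on BeautifulSoup-parsed HTML.

```python
import re

# Hypothetical keyword list; the project's real tag vocabulary may differ.
KNOWN_TECHNOLOGIES = ["python", "react", "typescript", "next.js", "sql"]

# Deliberately simple patterns, good enough for a sketch.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
LINK_RE = re.compile(r"https?://[^\s)>\]]+")


def extract_fields(text: str) -> dict:
    """Pull contact emails, links, and technology keywords out of a raw post."""
    lowered = text.lower()
    technologies = [
        tech
        for tech in KNOWN_TECHNOLOGIES
        # Lookarounds keep "sql" from matching inside "mysql" or "nosql".
        if re.search(rf"(?<![\w.]){re.escape(tech)}(?![\w])", lowered)
    ]
    return {
        "contact_email": EMAIL_RE.findall(text),
        "links": LINK_RE.findall(text),
        "text": text.strip(),
        "technologies": technologies,
    }


if __name__ == "__main__":
    post = (
        "Buscamos dev Python y React. "
        "Postula en https://example.com/jobs o escribe a rrhh@example.com"
    )
    print(extract_fields(post))
```

Posts that such rules cannot label reliably (for example, remote work policy or salary range buried in free text) are the kind of case where handing the text to a language model makes more sense.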

🤖 LLM Assistance

I used GPT-4 mini to assist in processing and categorizing job posts where manual effort was too time-consuming. You can view the LLM prompt I used here.

Frontend

The frontend is built with Next.js, which makes it easy to generate a static data API that serves the job data scraped during the first phase. The frontend consumes the JSON data and displays it through a clean, responsive UI.

API

The API is built with the Next.js static API. You can find it under /frontend/src/app/api.

Contributing

To submit a job post, create an issue that includes the following data:

{
  "sender": "<your-telegram-username-starting-with-an-@>",
  "contact_email": ["<your-email-(optional)>"],
  "contact_phone": ["<your-phone-(optional)>"],
  "links": ["<any-link-(optional)>"],
  "text": "<your-job-offer-text>",
  "date": {
    "day": "<current-day-number>",
    "month": "<current-month-number>",
    "year": "<current-year-number>",
    "time": "<current-time>"
  },
  "company_name": "<your-company-name>",
  "remote_work_policy": "<remote, hybrid, in person>",
  "employment_type": "<practica, fulltime, trabajo de título, part time>",
  "salary_range": {
    "currency": "<CLP, USD>",
    "min_bound": "<min-bound-for-position>",
    "max_bound": "<max-bound-for-position>"
  },
  "technologies": ["<any-technologies-the-applicant-should-know>"],
  "id": null
}
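
Before opening an issue, it can help to sanity-check a submission against the template above. The following is a minimal sketch, not a validator shipped with this repository: the function name `validate_submission` is hypothetical, and treating every field not marked "(optional)" as required is an assumption drawn from the template.

```python
# Allowed values copied from the placeholders in the template above.
REMOTE_POLICIES = {"remote", "hybrid", "in person"}
EMPLOYMENT_TYPES = {"practica", "fulltime", "trabajo de título", "part time"}
CURRENCIES = {"CLP", "USD"}

# Assumption: fields not marked "(optional)" in the template are required.
REQUIRED_FIELDS = {
    "sender", "text", "date", "company_name",
    "remote_work_policy", "employment_type", "salary_range", "technologies",
}


def validate_submission(post: dict) -> list[str]:
    """Return a list of problems; an empty list means the post looks valid."""
    errors = [f"missing field: {key}" for key in sorted(REQUIRED_FIELDS - post.keys())]

    if not str(post.get("sender", "")).startswith("@"):
        errors.append("sender must start with '@'")
    if post.get("remote_work_policy") not in REMOTE_POLICIES:
        errors.append("remote_work_policy must be one of: remote, hybrid, in person")
    if post.get("employment_type") not in EMPLOYMENT_TYPES:
        errors.append("employment_type is not one of the listed options")

    # Guard against salary_range being absent or not an object.
    salary = post.get("salary_range")
    currency = salary.get("currency") if isinstance(salary, dict) else None
    if currency not in CURRENCIES:
        errors.append("salary_range.currency must be CLP or USD")

    return errors
```

Calling `validate_submission(json.loads(issue_body))` on the JSON pasted into an issue would return an empty list when the required fields are present and the enumerated values match the template.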