This repository contains a web scraping project that extracts real estate data from the Makaan website. The goal of this project is to gather property information such as property name, location, price, area, and other relevant details from Makaan's website and store it in a structured format for further analysis.
Makaan is a popular real estate website that lists properties for sale and rent across various locations. Web scraping allows us to automatically gather data from the website, which can be useful for market research, analysis, or other real estate-related purposes.
This project uses Python and several libraries to perform web scraping. The primary library used for web scraping is BeautifulSoup
, which is a powerful library for extracting information from HTML and XML documents. Additionally, we use the requests
library to fetch the website's content.
To run this project, you'll need to have Python installed on your system. If you don't have it installed, you can download it from the official website: https://www.python.org/downloads/
Once you have Python installed, you can clone this repository to your local machine using the following command:
git clone https://github.com/your-username/makaan-real-estate-scraper.git
Next, navigate to the project directory:
cd makaan-real-estate-scraper
To install the required Python dependencies, run the following command:
pip install -r requirements.txt
To start scraping real estate data from Makaan, you need to provide the URL of the Makaan search results page you want to scrape. You can customize the search filters on the website (e.g., location, budget, property type) to narrow down the search results.
In the makaan_scraper.py
script, modify the URL
variable with the desired Makaan search results URL:
URL = "https://www.makaan.com/your-search-url"
After setting the URL, you can run the scraper using the following command:
python makaan_scraper.py
The script will fetch the data from the Makaan website, extract the relevant information, and store it in a CSV file named makaan_data.csv
.
The project structure is organized as follows:
makaan-real-estate-scraper/
|-- makaan_scraper.py
|-- requirements.txt
|-- README.md
makaan_scraper.py
: The main Python script responsible for scraping the Makaan website and saving the data to a CSV file.requirements.txt
: Contains a list of Python packages required for the project. You can install them using pip
.README.md
: The file you are currently reading, providing an overview of the project and its usage.The following Python libraries are used in this project:
requests
: Used for fetching the HTML content of the website.beautifulsoup4
: Used for parsing the HTML content and extracting relevant data.pandas
: Used for organizing and saving the data in a structured format.Make sure to install these libraries before running the script using the command mentioned in the Installation section.
Contributions to this project are welcome! If you find any issues or have ideas for improvements, feel free to open an issue or submit a pull request.
When contributing, please follow the existing code style and ensure that your changes do not break existing functionality. Also, consider adding appropriate comments and documentation for new code.
This project is licensed under the MIT License, which allows you to use, modify, and distribute the code for both personal and commercial purposes. See the LICENSE
file for more details.
Happy web scraping and real estate data analysis!
Please customize the above guide with your actual repository's content, and provide a comprehensive explanation of the project, its dependencies, and how to use it. Additionally, don't forget to include the proper licensing information and guidelines for contributing to the project.