urbanogilson / SICAR

This tool is designed for students, researchers, data scientists, or anyone who would like to have access to SICAR files.
https://urbanogilson.github.io/posts/sicar/
MIT License

SICAR

Features

Installation

Install SICAR with pip

pip install git+https://github.com/urbanogilson/SICAR

Prerequisite:

Google Tesseract OCR (additional info on how to install the engine on Linux, macOS, and Windows).

Optional: PaddleOCR (additional info on how to install the engine on Linux, macOS, and Windows).

If you don't want to install dependencies on your computer or don't know how to install them, we strongly recommend Google Colab.

Documentation

Usage/Examples

from SICAR import Sicar, State, Polygon
import pprint

# Create Sicar instance
car = Sicar()

# Get the data release date for each state
state_dates = car.get_release_dates()
pprint.pprint(state_dates)
# {<State.AC: 'AC'>: '04/08/2024',
#  <State.AL: 'AL'>: '04/08/2024',
#  <State.AM: 'AM'>: '04/08/2024',
#  <State.AP: 'AP'>: '03/08/2024',
#  <State.BA: 'BA'>: '03/08/2024',
#  <State.CE: 'CE'>: '06/08/2024',
#  <State.DF: 'DF'>: '06/08/2024',
#  <State.ES: 'ES'>: '04/08/2024',
#  <State.GO: 'GO'>: '04/08/2024',
#  <State.MA: 'MA'>: '04/08/2024',
#  <State.MG: 'MG'>: '04/08/2024',
#  <State.MS: 'MS'>: '06/08/2024',
#  <State.MT: 'MT'>: '05/08/2024',
#  <State.PA: 'PA'>: '03/08/2024',
#  <State.PB: 'PB'>: '04/08/2024',
#  <State.PE: 'PE'>: '03/08/2024',
#  <State.PI: 'PI'>: '02/08/2024',
#  <State.PR: 'PR'>: '03/08/2024',
#  <State.RJ: 'RJ'>: '02/08/2024',
#  <State.RN: 'RN'>: '03/08/2024',
#  <State.RO: 'RO'>: '05/08/2024',
#  <State.RR: 'RR'>: '06/08/2024',
#  <State.RS: 'RS'>: '05/08/2024',
#  <State.SC: 'SC'>: '05/08/2024',
#  <State.SE: 'SE'>: '05/08/2024',
#  <State.SP: 'SP'>: '05/08/2024',
#  <State.TO: 'TO'>: '05/08/2024'}

# Download APPS polygon for the PA state
car.download_state(State.PA, Polygon.APPS)
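
The release dates come back as strings, so comparing them needs parsing first. The sketch below is only an illustration: it copies a few values from the output above into a plain dictionary with string keys (so it runs without SICAR installed) and assumes the strings follow the day/month/year order. `parse_release_date` is a hypothetical helper, not part of the library.

```python
from datetime import date, datetime

# Sample copied from the output above; plain string keys stand in for State members.
release_dates = {
    "AC": "04/08/2024",
    "CE": "06/08/2024",
    "PA": "03/08/2024",
    "PI": "02/08/2024",
}


def parse_release_date(text: str) -> date:
    """Parse a 'DD/MM/YYYY' string (assumed format) into a date object."""
    return datetime.strptime(text, "%d/%m/%Y").date()


# Convert every value, then find the most recent release in the sample.
parsed = {state: parse_release_date(value) for state, value in release_dates.items()}
latest = max(parsed.values())
newest_states = sorted(state for state, day in parsed.items() if day == latest)

print(latest.isoformat(), newest_states)  # 2024-08-06 ['CE']
```

The same dictionary comprehension should work on the real result of get_release_dates(), provided its values keep the format shown above.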

OCR drivers

Optical character recognition (OCR) drivers are used to recognize characters in a captcha.

We currently have two options for automating the download process.
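
Because OCR can misread a captcha, an automated download may need more than one attempt. The following is a minimal, generic retry sketch and not the library's own mechanism: `retry` and `flaky_download` are hypothetical names, and it assumes that a failed attempt surfaces as an exception, which this README does not state.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(action: Callable[[], T], attempts: int = 3, delay: float = 0.0) -> T:
    """Call `action` until it succeeds or `attempts` calls have failed."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as error:  # assumption: failures are raised as exceptions
            last_error = error
            if attempt < attempts:
                time.sleep(delay)  # wait before the next try
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error


# Stand-in for a download that fails twice (e.g. captcha misread), then succeeds.
calls = {"count": 0}


def flaky_download() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ValueError("captcha misread")
    return "PA_APPS.zip"


result = retry(flaky_download, attempts=5)
print(result, calls["count"])  # PA_APPS.zip 3
```

In real use, `action` could be a lambda wrapping car.download_state(...); the library may already retry internally, in which case a wrapper like this is unnecessary.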

Tesseract OCR (Default)

from SICAR import Sicar, State, Polygon
from SICAR.drivers import Tesseract

# Create Sicar instance using Tesseract OCR
car = Sicar(driver=Tesseract)

# Download a state
car.download_state(State.SP, Polygon.LEGAL_RESERVE, folder='SICAR/SP')

PaddleOCR

Install SICAR with pip and include Paddle dependencies

pip install 'SICAR[paddle] @ git+https://github.com/urbanogilson/SICAR'

from SICAR import Sicar, State, Polygon
from SICAR.drivers import Paddle

# Create Sicar instance using PaddleOCR
car = Sicar(driver=Paddle)

# Download a state
car.download_state(State.AM, Polygon.CONSOLIDATED_AREA, folder='SICAR/AM')
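
Since PaddleOCR is optional, a script may want to fall back to Tesseract when the Paddle engine is not installed. The sketch below shows one way to decide that without importing SICAR. `is_installed` and `choose_driver_name` are hypothetical helpers, and the check assumes PaddleOCR is importable under the module name `paddleocr`.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return find_spec(module_name) is not None


def choose_driver_name(prefer_paddle: bool = True) -> str:
    """Pick 'Paddle' when its engine is importable, otherwise 'Tesseract'."""
    # Assumption: the PaddleOCR engine is importable as `paddleocr`.
    if prefer_paddle and is_installed("paddleocr"):
        return "Paddle"
    return "Tesseract"


driver_name = choose_driver_name()
print(driver_name)
```

The returned name can then be mapped to the matching class imported from SICAR.drivers, as in the two examples above.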

Run with Google Colab

Using Google Colab, you don't need to install the dependencies on your computer and you can save files directly to your Google Drive.

Run with Docker

Pull the image from Docker Hub (urbanogilson/sicar)

docker pull urbanogilson/sicar:latest

Run the downloaded Docker Image using an entry point (file) from your machine (host)

docker run -i -v "$(pwd)":/sicar urbanogilson/sicar:latest - < ./examples/docker.py

Note: Update the entry point file ./examples/docker.py or create a new one to download data based on your needs.

Or pass a script through STDIN

docker run -i -v "$(pwd)":/sicar urbanogilson/sicar:latest - <<EOF
from SICAR import Sicar, State, Polygon
from SICAR.drivers import Paddle

car = Sicar(driver=Paddle)

car.download_state(state='MG', polygon=Polygon.CONSOLIDATED_AREA, folder='MG')
EOF

Note: With $(pwd) as the volume source, the container saves the downloaded data to the current folder.

Optional: Make an external directory to store the downloaded data and use a volume parameter in the run command to point to it.

Data dictionary

| Attribute | Description |
| --- | --- |
| cod_estado | Federative unit (state) in which the registration is located. |
| municipio | Municipality in which the registration is located. |
| num_area | Gross area of the rural property or the subject that makes up the registry, in hectares. |
| cod_imovel | Registration number in the Rural Environmental Registry (CAR). |
| ind_status | Status of the registration in CAR, according to Normative Instruction No. 2, of May 6, 2014, of the Ministry of the Environment (https://www.car.gov.br/leis/IN_CAR.pdf), and Resolution No. 3, of August 27, 2018, of the Brazilian Forest Service (https://imprensanacional.gov.br/materia/-/asset_publisher/Kujrw0TZC2Mb/content/id/38537086/do1-2018-08-28-resolucao-n-3-de-27-de-agos-de-2018-38536774): AT - Active; PE - Pending; SU - Suspended; CA - Cancelled. |
| des_condic | Condition of the registration in the analysis flow of the competent body. |
| ind_tipo | Type of rural property: IRU - Rural Property; AST - Agrarian Reform Settlements; PCT - Traditional Territory of Traditional Peoples and Communities. |
| mod_fiscal | Number of rural property tax modules. |
| nom_tema | Name of the theme that makes up the registration (Permanent Preservation Area, Path, Remnant of Native Vegetation, Restricted Use Area, Administrative Easement, Legal Reserve, Hydrography, Wetlands, Consolidated Rural Area, Areas with Altitude Higher than 1800 meters, Areas with Slopes Higher than 45 degrees, Hilltops, Plateau Edges, Fallow Areas, Mangroves and Restinga). |
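
The coded attributes above can be turned into readable labels with simple lookup tables. The sketch below uses plain dictionaries so it runs without any geospatial library. The code-to-label mappings come from the data dictionary, while `describe_record` and the sample record are hypothetical, and `num_area` is assumed to be numeric.

```python
# Code-to-label mappings taken from the data dictionary above.
STATUS_LABELS = {
    "AT": "Active",
    "PE": "Pending",
    "SU": "Suspended",
    "CA": "Cancelled",
}
PROPERTY_TYPE_LABELS = {
    "IRU": "Rural Property",
    "AST": "Agrarian Reform Settlements",
    "PCT": "Traditional Territory of Traditional Peoples and Communities",
}


def describe_record(record: dict) -> str:
    """Build a one-line summary of a registration record (hypothetical helper)."""
    status = STATUS_LABELS.get(record["ind_status"], "Unknown status")
    kind = PROPERTY_TYPE_LABELS.get(record["ind_tipo"], "Unknown type")
    # Assumption: num_area is a number expressed in hectares.
    return f'{record["cod_imovel"]}: {kind}, {status}, {record["num_area"]:.1f} ha'


# Invented record for illustration only; not real SICAR data.
sample = {
    "cod_imovel": "XX-0000000-EXAMPLE",
    "ind_status": "AT",
    "ind_tipo": "IRU",
    "num_area": 12.5,
}
print(describe_record(sample))  # XX-0000000-EXAMPLE: Rural Property, Active, 12.5 ha
```

The same lookups could be applied column-wise after loading a downloaded shapefile into a table-like structure.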

Acknowledgements

Roadmap

Contributing

The development environment with all necessary packages is available using Visual Studio Code Dev Containers.

Contributions are always welcome!

Feedback

If you have any feedback, please reach out to me at hello@gilsonurbano.com.

License

MIT