Closed Kotzly closed 3 years ago
It is possible to access the Monitoring Panels to create tables and visualizations that are almost analysis-ready. These tables contain the selected information and in most cases have different options for aggregating information.
Another option is TABNET, where the information is also online, but there are many options for aggregating, selecting and filtering data. The following images show where to access this information, and an example of how the data is presented.
TABNET data can be downloaded as .csv or .tab, to be loaded by TABWIN.
This data can also be downloaded in a rawer format at this link, by accessing the TabWin option on the DataSUS website. This loads the File Transfer page, where you can download all types of data, their documentation (with data dictionaries) and more.
In this example we won't use TABWIN itself, but we will use the dbf2dbc.exe program that comes with it.
These are the steps to download it:
Follow the same steps, but select "Documentação" instead of "Programas" to download TABWIN's documentation.
On this same page you can download the data:
Example:
The files come in .dbc format, which appears to be a compressed database format. First we need to decompress these files. This can be done using the dbf2dbc.exe program, which comes with TABWIN or can be downloaded here. This program decompresses each .dbc file to the .dbf format, which can be loaded with TABWIN or with Python using the simpledbf package. More information about the dbf2dbc tool can be found here.
To use the dbf2dbc tool, first unzip the data that you downloaded to a folder. This folder can contain one or many .dbc files. Let's suppose this folder is C:\Users\Joao\Documents\data. Now open a command line (cmd or PowerShell), go to the folder where dbf2dbc is located and run the command:
dbf2dbc.exe "C:\Users\Joao\Documents\data\*.dbc" "C:\Users\Joao\Documents\data"
This decompresses the .dbc files to .dbf files and saves them in the same folder. You can change the second argument to save the files elsewhere.
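If there are many folders to process, the same command can be launched from Python. The following is only a minimal sketch, assuming dbf2dbc.exe accepts exactly the two arguments shown in the command above. The function names and the paths passed to them are hypothetical, and the program is run from its own folder, as in the manual steps.

```python
import subprocess
from pathlib import Path


def build_dbf2dbc_command(exe_path, data_dir, out_dir=None):
    """Build the argument list for the dbf2dbc.exe call described above.

    All three paths are hypothetical examples supplied by the caller.
    """
    data_dir = Path(data_dir)
    out_dir = Path(out_dir) if out_dir is not None else data_dir
    # The "*.dbc" wildcard is passed through unchanged; no shell expands it.
    return [str(exe_path), str(data_dir / "*.dbc"), str(out_dir)]


def decompress_folder(exe_path, data_dir, out_dir=None):
    """Run dbf2dbc.exe on a folder and return the .dbf files found afterwards."""
    exe = Path(exe_path).resolve()
    data = Path(data_dir).resolve()
    out = Path(out_dir).resolve() if out_dir is not None else data
    command = build_dbf2dbc_command(exe, data, out)
    # Run from the program's own folder; check=True raises on a non-zero exit code.
    subprocess.run(command, cwd=exe.parent, check=True)
    return sorted(out.glob("*.dbf"))
```

For example, decompress_folder("C:/tabwin/dbf2dbc.exe", "C:/Users/Joao/Documents/data") would mirror the command above, where C:/tabwin is an assumed install location.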
The .dbf files can be loaded within TABWIN, or you can load them into Python using the simpledbf package. The following sample code is in Python: it loads the C:/Users/Joao/data/DNSP2018.dbf file, converts it to a pandas DataFrame, and saves it in CSV format in the same folder.
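import pandas as pd
from simpledbf import Dbf5

filepath = "C:/Users/Joao/data/DNSP2018.dbf"
# Read the .dbf file and convert it to a pandas DataFrame
dbf = Dbf5(filepath)
df = dbf.to_dataframe()
# Write the CSV from the DataFrame, so the Dbf5 object is only read once
df.to_csv("C:/Users/Joao/data/DNSP2018.csv", index=False)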
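When the folder holds several .dbf files, the single-file example can be wrapped in a loop. This is a rough sketch rather than a tested pipeline: convert_folder and read_with_simpledbf are hypothetical helper names, and the only simpledbf calls used are Dbf5 and to_dataframe, as in the example above.

```python
from pathlib import Path


def read_with_simpledbf(dbf_path):
    """Load one .dbf file into a pandas DataFrame using simpledbf."""
    # Imported lazily so the module loads even without simpledbf installed.
    from simpledbf import Dbf5
    return Dbf5(str(dbf_path)).to_dataframe()


def convert_folder(folder, reader=read_with_simpledbf):
    """Write a .csv next to every .dbf file in `folder` and return the CSV paths."""
    written = []
    for dbf_path in sorted(Path(folder).glob("*.dbf")):
        csv_path = dbf_path.with_suffix(".csv")
        # Each file gets its own reader call; the DataFrame writes the CSV.
        reader(dbf_path).to_csv(csv_path, index=False)
        written.append(csv_path)
    return written
```

Calling convert_folder("C:/Users/Joao/data") would then produce one CSV per .dbf file in that folder. The reader argument is there only so a different loader can be swapped in.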
DATASUS is an online platform that provides access to public health data from Brazil. The goal of this issue is to explain how to access, download and pre-process the database files.