glourencoffee / pycvm

Python library for processing data from CVM
MIT License

Generating balances from DFP of 2018 raises AccountLayoutError #5

Status: Closed (glourencoffee closed this issue 2 years ago)

glourencoffee commented 2 years ago

Description

Generating the balance sheet and/or the income statement from DFP documents read with dfpitr_reader() raises the exception AccountLayoutError.

Steps to reproduce

  1. Download the DFP file of 2018
  2. Read DFPITR documents from that file by calling dfpitr_reader()
  3. Generate the balance sheet of each document by calling BalanceSheet.from_dfpitr()
  4. Generate the income statement of each document by calling IncomeStatement.from_dfpitr()
  5. See exception

Expected behavior

BalanceSheet and IncomeStatement are generated.

Actual behavior

Exception AccountLayoutError is raised.

For example, when the sample program print_balances.py is run with the DFP of 2018 as its argument, it outputs:

...
=================================
BRB BCO DE BRASILIA S.A. (2018-12-31, versão: 1)
o DFP/ITR não tem balanço consolidado
=================================
BRB BCO DE BRASILIA S.A. (2018-12-31, versão: 2)
erro: invalid BPA or BPP: ["IndustrialBPAValidator: invalid account name 'Caixa e Equivalentes de Caixa' at index 1 (expected: 'Ativo Circulante')", "FinancialBPPValidator: invalid account name 'Passivos Financeiros ao Valor Justo através do Resultado' at index 1 (expected: 'Passivos Financeiros para Negociação')", "InsuranceBPAValidator: invalid account name 'Caixa e Equivalentes de Caixa' at index 1 (expected: 'Ativo Circulante')"]
=================================
CENTRAIS ELET BRAS S.A. - ELETROBRAS (2018-12-31, versão: 1)
...

glourencoffee commented 2 years ago

While analyzing this problem, I realized it is another nuisance coming from CVM's side. It turns out that the BPP of 7 financial companies in the DFP of 2018 differs from the expected layout.

The layout of financial companies is expected to have the following accounts:

This holds for every year from 2010 through 2019 (2020 and 2021 couldn't be tested because of callmegiorgio/pycvm#9).

However, there is a second layout for financial companies in the DFP of 2018, a layout that has the same accounts as above, with the difference that accounts 2.01 and 2.02 have other names:

This second layout is given by the following companies:

I could solve this problem by making two layouts of financial companies for the year of 2018, but I think it should be tackled using a different approach.

Ideally, the type of layout (financial, industrial, or insurance) used by a company would be given in DFP/ITR files, but since they lack this information, this library may instead have its own internal mapping of companies and their layout types. For example, the company Banco do Brasil ("BCO BRASIL") always uses financial layouts. Thus, instead of trying to find which layout "BCO BRASIL" uses, this library would already know it uses the financial layout and would validate whether the expected accounts are there. This also allows the validation process to compare only account codes and ignore account names, which indirectly solves this issue, since the only problem going on here is the difference in names.
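A minimal sketch of that idea, not the library's actual implementation. Every name here (`LayoutType`, `Account`, `COMPANY_LAYOUTS`, `EXPECTED_BPP_CODES`, `UnknownCompanyError`, `validate_bpp_codes`) is hypothetical. The expected codes are cut down to the two accounts discussed above, and pairing the two account names from the error output with code 2.01 is an assumption made only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class LayoutType(Enum):
    FINANCIAL = "financial"
    INDUSTRIAL = "industrial"
    INSURANCE = "insurance"


@dataclass(frozen=True)
class Account:
    code: str
    name: str


class UnknownCompanyError(LookupError):
    """Raised when a company is missing from the internal mapping."""


# Hypothetical internal mapping from company name to layout type.
# It would have to grow as new companies become public.
COMPANY_LAYOUTS = {
    "BCO BRASIL": LayoutType.FINANCIAL,
}

# Placeholder: only the two BPP codes mentioned in this issue.
# A real layout would list every expected account code.
EXPECTED_BPP_CODES = {
    LayoutType.FINANCIAL: ("2.01", "2.02"),
}


def validate_bpp_codes(company_name, accounts):
    """Look up the company's layout and compare account codes only."""
    try:
        layout = COMPANY_LAYOUTS[company_name]
    except KeyError:
        raise UnknownCompanyError(f"no layout registered for {company_name!r}") from None

    expected = EXPECTED_BPP_CODES[layout]
    actual = tuple(account.code for account in accounts)

    # Account names are deliberately ignored, so a renamed 2.01 still passes.
    if actual != expected:
        raise ValueError(f"{layout.value} BPP: expected codes {expected}, got {actual}")

    return layout
```

With this approach, two account lists that share codes but differ in names, such as `Account("2.01", "Passivos Financeiros para Negociação")` and `Account("2.01", "Passivos Financeiros ao Valor Justo através do Resultado")`, both validate against the same financial layout. A company missing from the mapping fails loudly instead of being guessed.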

The disadvantage of the above approach is that this library will need to be updated as new companies become public, but this seems better than working around layout names... argh.