excieve / dragnet

Catching the big fish
MIT License

add flag "estate cost missing" #22

Open pro100olga opened 6 years ago

pro100olga commented 6 years ago

Condition: the JSON contains real-estate objects for which both data$step_3$costDate and data$step_3$costAssessment are equal to zero or to “Не відомо”, “Не застосовується”, “Член сім’ї не надав інформацію” (in principle, any letters at all). It is important that both fields (costDate and costAssessment) are like this for the same object. If there is at least one such object, the condition is considered met and the remaining objects are not checked. Write the result to the variable estate.has_hidden_cost (values True / False).
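The condition above could be sketched roughly as follows. This is only an illustration, not the dragnet implementation: the function names are hypothetical, step_3 is assumed to be a dict of estate records keyed by object id, and treating absent or empty fields and comma decimals the way shown here is an assumption that the issue does not spell out.

```python
def is_missing(value):
    """True if a cost field is absent, zero, or non-numeric text."""
    if value is None:
        # Assumption: an absent field is treated like an unknown one.
        return True
    # Assumption: a comma may be used as the decimal separator.
    text = str(value).strip().replace(",", ".")
    try:
        return float(text) == 0
    except ValueError:
        # Any text such as "Не відомо" or "Не застосовується" counts as missing.
        return True


def estate_has_hidden_cost(step_3):
    """True if at least one estate object lacks both costDate and costAssessment."""
    # Assumption: step_3 is a dict of estate records keyed by object id.
    estates = step_3.values() if isinstance(step_3, dict) else []
    # any() stops at the first matching object, so the rest are not checked.
    return any(
        isinstance(estate, dict)
        and is_missing(estate.get("costDate"))
        and is_missing(estate.get("costAssessment"))
        for estate in estates
    )
```

Both fields have to be missing on the same object, so an object with costDate equal to 0 but a real costAssessment does not raise the flag.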

pro100olga commented 6 years ago

There are some cases where the value is False but should be True.

Example IDs: nacp_534fefb2-9d1d-46b5-8a82-e19f5c68239a, nacp_5355c2bd-e222-4919-8a28-a1c6900cf4d9

Both costDate and costAssessment here are equal to 0, so the flag should be True.

pro100olga commented 4 years ago

Checked on partial data as of 24/02/20 (~20K rows). All OK, but I suggest limiting the check to estate acquired within the last 5 years (otherwise the flag is raised for almost half of the declarations).

pro100olga commented 4 years ago

Checked on partial data as of 01/03/20 (~70K rows)

Checked 10 files

1) nacp_00c0df8c-9cd2-4a17-a800-5973fdc18537: the date of acquiring the rights is unknown, yet the flag is still set to True.

dchaplinsky commented 4 years ago
[Screenshot: new_flags_sample 2020-03-01 16-13-59]

It's set but in API only. Do you know what owningDate_extendedstatus means?

dchaplinsky commented 4 years ago

https://public-api.nazk.gov.ua/v1/declaration/00c0df8c-9cd2-4a17-a800-5973fdc18537

pro100olga commented 4 years ago

> It's set but in API only. Do you know what owningDate_extendedstatus means?

I do not know for sure, but from observations I see that the status variables show different reasons for missing information: "Конфіденційна інформація" (confidential information), "Член сім'ї не надав інформацію" (family member did not provide the information), "Невідомо" (unknown), etc. Value 3 should mean "Член сім'ї не надав інформацію".

However, the JSON of this declaration is inconsistent: it contains an owning date and at the same time states that the date is not provided. The HTML shows that the date is not provided.

I guess it is an error in the file, so it is not a problem with the algorithm.

Still need to check on the full dataset

pro100olga commented 4 years ago

Checking on annual declarations from 2018-2019.

Should be True, but is False:
nacp_a6ec5ed6-d244-44b5-b7c4-3b9e9739b367
nacp_a6d9882c-edd9-4391-ad94-fb0afbd748f4
nacp_10b21a17-16ff-4f6c-b365-fc498897d12b
nacp_10c128fc-1173-4400-9a4b-8e2478f739a1
nacp_a7a75fc3-e798-437d-a3e4-ce2af1da6d4c

Also, does "last 5 years" include the declaration year? For example, for a 2018 declaration, is 2013 included in the last 5 years?

pro100olga commented 4 years ago

@dchaplinsky When checking for the last 5 years, is the "oldest" year included? E.g., if the declaration is for 2017, is 2012 included in the last 5 years?

dchaplinsky commented 4 years ago

It is. The check is: owning_year + 5 >= parseInt(nacp_doc.step_0.declarationYear1)
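To make the boundary explicit, here is a small Python rendering of that comparison. The function name and the window parameter are hypothetical, and int() is used in place of parseInt, which is stricter because it rejects strings with trailing non-digit characters.

```python
def acquired_within_window(owning_year, declaration_year, window=5):
    """Mirror owning_year + 5 >= declarationYear1, parsing the year as an int."""
    # declaration_year is assumed to arrive as a string such as "2017".
    return owning_year + window >= int(declaration_year)
```

For a 2017 declaration, acquired_within_window(2012, "2017") holds because 2012 + 5 = 2017, while acquired_within_window(2011, "2017") does not, so the "oldest" year is included.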


dchaplinsky commented 4 years ago

All of the examples above have ownership type "rent", "usage right", or "owner is a third party".


pro100olga commented 4 years ago

Since we now have the #45 flag, which serves the goal of finding hidden estate cost, I suggest removing the estate_has_hidden_cost column from the calculation.