excieve / dragnet

Catching the big fish
MIT License

add flag "has_non_bank_liabilities" #38

Closed pro100olga closed 4 years ago

pro100olga commented 4 years ago

Description: As in https://github.com/excieve/dragnet/issues/31, banks are identified using EDRPOU codes and these words in the company name: "твбв", "ощадбанк", "приватбанк", "аваль", "райффайзен", "абанк", "агріколь", "укрсиббанк", "альфабанк", "пумб", "укргазбанк", "мегабанк", "акордбанк", "сбербанк", "таскомбанк", "кредобанк", "індустріалбанк", "укрексімбанк", "радабанк", "укркомунбанк", "укрбудінвестбанк", "правексбанк", "правекс", "прокредит", "метабанк", "комінвестбанк", "форвард",
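A minimal sketch of how such a check might look, assuming the flag is raised when at least one creditor matches neither a known bank EDRPOU code nor a bank keyword. The function names, the record layout (`name`, `edrpou` keys), and the shortened keyword list are hypothetical and are not taken from the dragnet code base; the single EDRPOU code is the Дельта-банк code quoted in this thread.

```python
# Hypothetical sketch, not the dragnet implementation.
# A subset of the keyword list from the description above.
BANK_KEYWORDS = ("ощадбанк", "приватбанк", "альфабанк", "пумб")

# Assumed lookup set; 34047020 is the Дельта-банк code cited in this thread.
BANK_EDRPOUS = {"34047020"}


def is_bank(name, edrpou):
    """Return True if the creditor looks like a bank by code or by name."""
    if edrpou and edrpou.strip() in BANK_EDRPOUS:
        return True
    lowered = (name or "").lower()
    return any(keyword in lowered for keyword in BANK_KEYWORDS)


def has_non_bank_liabilities(obligations):
    """Raise the flag if any obligation has a creditor that is not a bank."""
    return any(
        not is_bank(item.get("name"), item.get("edrpou"))
        for item in obligations
    )
```

With this sketch, a declaration whose only creditor is "Ощадбанк" gets False, while one listing a creditor such as "ТОВ Ромашка" with an unknown code gets True.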

Final description of the flag (written before closing issue): Check all financial obligations. Raise the flag if there is an object for which:

pro100olga commented 4 years ago

Checked on partial data (67K rows); 10 files checked.

1) nacp_00c0c49c-f7dd-42b2-b971-e5dca23a4f6b has EDRPOU 34047020 (Дельта-банк), but the flag is set to True

dchaplinsky commented 4 years ago
Declarations: Осипенко Олеся Миколаївна, from NACP, 2020-03-12 00-42-57

Well.

pro100olga commented 4 years ago

Ok, need to check on full dataset

pro100olga commented 4 years ago

Checked on annual 2017 declarations (~800K) using this code: https://github.com/pro100olga/declarations/tree/master/bi/red_flags/flags_03_2020

@dchaplinsky Please add to the list of bank names: "укрсоцбанк"

Also, do you use the list of banks' EDRPOU codes? See these documents, for example: nacp_00c80022-4f89-4c38-9461-b7dcac085b21 nacp_04f51beb-a7f9-42c2-8d94-998142567ab7 nacp_2eb2c806-cd3d-423d-aa65-9dbbcf6ee296. The flag is set to True, but should be False.

Also, see here, where the flag is set to False but should be True: nacp_4b2fcafe-c18f-45e4-aa92-b13997b1534a
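The comparison reported above can be sketched as follows, assuming both the reference calculation and the dragnet output are available as mappings from document ID to a boolean flag. The function name and the document IDs in the usage note are placeholders, not data from this issue.

```python
# Hypothetical sketch of comparing two sets of flag values.
def find_mismatches(reference, produced):
    """Split disagreements into the two kinds reported in this thread.

    reference: dict of doc_id -> expected flag value
    produced:  dict of doc_id -> flag value under test
    Documents missing from either mapping are ignored.
    """
    shared = reference.keys() & produced.keys()
    # Flag is True in the output, but the reference says False.
    true_should_be_false = sorted(
        doc for doc in shared if produced[doc] and not reference[doc]
    )
    # Flag is False in the output, but the reference says True.
    false_should_be_true = sorted(
        doc for doc in shared if not produced[doc] and reference[doc]
    )
    return true_should_be_false, false_should_be_true
```

For example, `find_mismatches({"doc_a": False, "doc_b": True}, {"doc_a": True, "doc_b": False})` returns `(["doc_a"], ["doc_b"])`.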

dchaplinsky commented 4 years ago

re: nacp_4b2fcafe-c18f-45e4-aa92-b13997b1534a Why should it be? The last 3 records are about insurance.

Also, could you provide me some true positives for this one please?


pro100olga commented 4 years ago

Checked on partial data from the annual declarations for 2017 (~18K). 1 mismatch.

True, but should be False:

dchaplinsky commented 4 years ago

Should be fixed now.

pro100olga commented 4 years ago

Checked on ~500K documents from the annual 2017 declarations: ~300 mismatches, all False but should be True.

nacp_eb116aaa-d89d-4ec4-b236-c7b24e5f8101
nacp_efeadb4e-69c9-47ea-8b2e-bd0604a16eee - here the bank name is "альфа банк", but the list of bank names has "альфабанк", so the result should be True (non-bank object).
nacp_efd289a8-74b9-4b9a-8cba-6f7d7f49504e
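The "альфа банк" versus "альфабанк" case comes down to how the name is normalized before matching. A minimal sketch of one option, stripping whitespace and hyphens before the substring check, is given here; the function names are hypothetical and this is not necessarily how dragnet handles it.

```python
import re


def normalize(name):
    """Lowercase the name and drop whitespace and hyphens."""
    return re.sub(r"[\s\-]+", "", name.lower())


def matches_bank_keyword(name, keywords):
    """Check keywords against the normalized name instead of the raw one."""
    normalized = normalize(name)
    return any(keyword in normalized for keyword in keywords)


keywords = ["альфабанк"]

# A literal substring check misses the spaced spelling.
print("альфабанк" in "альфа банк")                       # False
# The normalized check treats both spellings as the same bank.
print(matches_bank_keyword("ПАТ «Альфа Банк»", keywords))  # True
```

The trade-off is that removing spaces can create matches across word boundaries for short keywords, so adding each spelling variant to the list explicitly is the more conservative choice.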

dchaplinsky commented 4 years ago

Again, I've checked it on a fresh copy and all of them are good. I've added Альфа банк to the list tho


pro100olga commented 4 years ago

Checked 3 mismatches mentioned above - all ok. Will check on full dataset later

pro100olga commented 4 years ago

Checked on 500K documents (annual declarations from 2017) using this code

All ok. Closed.

Final description of the flag: Check all financial obligations. Raise the flag if there is an object for which: