excieve / dragnet

Catching the big fish
MIT License

add flag "has_major_real_estate" #40

Closed pro100olga closed 4 years ago

pro100olga commented 4 years ago

Raise the flag when the declaration contains a residential real estate object (житлова нерухомість) with an area of > 300 sq.m., under any form of ownership.

Final description of the flag (written before closing the issue): check all estate objects of the types 'Житловий будинок' (residential house), 'Квартира' (apartment), 'Кімната' (room) and 'Садовий (дачний) будинок' (garden/dacha house), plus "Інше" (other) if the description contains the word "квартира". If the area of such an object is > 300 sq.m., raise the flag. Ownership rights are not considered here.

pro100olga commented 4 years ago

Checked on partial data (67K rows): 10 files checked, all OK.

Still needs to be checked on the full dataset.

pro100olga commented 4 years ago

How is the object type defined here?

E.g. in this declaration, https://declarations.com.ua/declaration/nacp_d9c65b76-d4f2-40d6-8139-000101e5993e, we have 1 object with type Інше - Комерційне приміщення (Other: commercial premises), and the flag is False (OK).

But here, https://declarations.com.ua/declaration/nacp_a9d56804-2c43-42e1-90da-675bfaabd285, we have 1 object with a qualifying area (land aside), with type Інше - Адміністративний будинок (Other: administrative building), and the flag is True.

Is the "other" field processed somehow, and if so, how? @dchaplinsky

dchaplinsky commented 4 years ago

            case 'other':
                if (estate_doc.otherObjectType.toLowerCase().indexOf("будинок") != -1)
                    has_real_estate = true;
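For illustration, a minimal standalone sketch of why the two declarations above come out differently under this substring test. The helper name `otherTypeMatches` is hypothetical and does not appear in `red_flags.es6`; only the `toLowerCase().indexOf("будинок")` check is taken from the snippet.

```javascript
// Hypothetical helper that mirrors the substring test from the snippet above.
function otherTypeMatches(otherObjectType) {
  return otherObjectType.toLowerCase().indexOf("будинок") !== -1;
}

// The two free-text descriptions from the declarations in question.
console.log(otherTypeMatches("Комерційне приміщення"));    // false
console.log(otherTypeMatches("Адміністративний будинок")); // true
```

Any "other" description containing "будинок" matches, so an administrative building is treated like a house.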

pro100olga commented 4 years ago

What about objectType? Is anything besides "Земельна ділянка" (land plot) excluded?

dchaplinsky commented 4 years ago

https://github.com/excieve/dragnet/blob/a95a58fc5bd2f99651cc8cf679007c8b16411969/views/red_flags.es6#L909

pro100olga commented 4 years ago

I suggest that we use objectType directly, without referring to this encoding, and consider only the following types: 'Житловий будинок', 'Квартира', 'Кімната', 'Садовий (дачний) будинок'.

This excludes Офіс (office), Гараж (garage) and Земельна ділянка (land plot).

dchaplinsky commented 4 years ago

Should be fixed now.

pro100olga commented 4 years ago

Checked on ~500K documents from the 2017 annual declarations: ~600 mismatches.

False, but should be True:
nacp_4baa4df7-4d69-4d6b-9d29-79814b2aff1e (is there a filter on ownership type? If so, I would suggest removing it, as renting a large apartment is also of interest)
nacp_e8fc6c5c-7fb8-4cec-83a8-55f16afd96b0
nacp_e83f1860-fe8b-471a-92e6-6f7414978b85

True, but should be False:
nacp_85d9625b-2d51-4145-99a7-a913416c7546
nacp_b487d183-d440-41f0-82fc-546fae413726
nacp_f8fbcc7c-ed80-495d-8d3d-8013de2e6ba1
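A rough sketch of how mismatches like these can be split into the two groups, assuming each row carries the flag from an independent reference check (`expected`) and the flag produced by the pipeline (`actual`). The function name, field names and sample ids are hypothetical and used only for illustration.

```javascript
// Hypothetical helper: split disagreements between a reference check
// (expected) and the pipeline's flag (actual) into two groups.
function splitMismatches(rows) {
  const falseNegatives = []; // False, but should be True
  const falsePositives = []; // True, but should be False
  for (const { id, expected, actual } of rows) {
    if (expected && !actual) falseNegatives.push(id);
    else if (!expected && actual) falsePositives.push(id);
  }
  return { falseNegatives, falsePositives };
}

// Made-up rows; the ids are placeholders, not real declarations.
const result = splitMismatches([
  { id: "doc_a", expected: true, actual: false },
  { id: "doc_b", expected: false, actual: true },
  { id: "doc_c", expected: true, actual: true },
]);
console.log(result.falseNegatives, result.falsePositives);
```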

dchaplinsky commented 4 years ago

Fixed, will send shortly

pro100olga commented 4 years ago

Checked the 6 mismatches listed above.

The first 3 (False, but should be True) are OK.

But the others (True, but should be False) are still True:
nacp_85d9625b-2d51-4145-99a7-a913416c7546
nacp_b487d183-d440-41f0-82fc-546fae413726
nacp_f8fbcc7c-ed80-495d-8d3d-8013de2e6ba1

dchaplinsky commented 4 years ago

Should be fixed now

pro100olga commented 4 years ago

Checked on 500K documents (annual declarations from 2017) using this code

All ok. Closed.

Final description of the flag: check all estate objects of the types 'Житловий будинок' (residential house), 'Квартира' (apartment), 'Кімната' (room) and 'Садовий (дачний) будинок' (garden/dacha house), plus "Інше" (other) if the description contains the word "квартира". If the area of such an object is > 300 sq.m., raise the flag. Ownership rights are not considered here.
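The final description could be sketched roughly as follows. This is not the code from `red_flags.es6`: the function names and the `totalArea` field are assumptions, and the area is assumed to be a number or a dot-decimal numeric string.

```javascript
// Types listed in the final description of the flag.
const RESIDENTIAL_TYPES = new Set([
  "Житловий будинок",
  "Квартира",
  "Кімната",
  "Садовий (дачний) будинок",
]);
const AREA_THRESHOLD = 300; // sq.m.

// Assumed document shape: objectType, otherObjectType, totalArea.
function isResidential(estate) {
  if (RESIDENTIAL_TYPES.has(estate.objectType)) return true;
  if (estate.objectType === "Інше") {
    const description = (estate.otherObjectType || "").toLowerCase();
    return description.includes("квартира");
  }
  return false;
}

// Ownership rights are deliberately not consulted.
function hasMajorRealEstate(estates) {
  return estates.some(
    (estate) => isResidential(estate) && Number(estate.totalArea) > AREA_THRESHOLD
  );
}

console.log(hasMajorRealEstate([{ objectType: "Квартира", totalArea: 320 }])); // true
console.log(hasMajorRealEstate([{ objectType: "Земельна ділянка", totalArea: 5000 }])); // false
```

Under this sketch an "Інше" object described as "Адміністративний будинок" no longer raises the flag, since the description does not contain "квартира".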