open-contracting / cardinal-rs

Measure red flags and procurement indicators using OCDS data
https://cardinal.readthedocs.io
MIT License

R042: Describe an abnormal contract detail (address, phone number) #88

Open jpmckinney opened 1 year ago

jpmckinney commented 1 year ago

Needed to calculate indicator.

Camilamila commented 1 year ago

For this, we can check the phone number or the address. I don't think Ecuador has a specific address format.

jpmckinney commented 1 year ago

Aha, but what makes a phone number (or address) "abnormal" in Ecuador?

The notebook just suggests: len(parties/contactPoint/telephone) ≠ len(validtelephone)

jpmckinney commented 9 months ago

@Camilamila Please see my question above.

Camilamila commented 9 months ago

For addresses, I asked SERCOP to verify the address field in the BI; however, there seems to be no standard way of writing addresses, so we might not be able to use this. For phone numbers: in Ecuador, a landline number has 8 digits and a mobile number has 9 digits. A number that starts with 9 is a mobile number. The international country code is 593. So the rule could be:
if (parties/contactPoint/telephone) starts_with 9 & len(parties/contactPoint/telephone) = 9 then valid phone
if (parties/contactPoint/telephone) starts_with between 2 and 7 & len(parties/contactPoint/telephone) = 8 then valid phone
if (parties/contactPoint/telephone) starts_with 5939 & len(parties/contactPoint/telephone) = 12 then valid phone
if (parties/contactPoint/telephone) starts_with 593(2-7) & len(parties/contactPoint/telephone) = 11 then valid phone
else abnormal phone
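The rule above could be sketched in Rust roughly as follows. This is only an illustration: the function name `is_valid_ec_phone` is hypothetical (it is not part of Cardinal), and it assumes the telephone value has already been reduced to digits only.

```rust
/// Apply the four cases of the proposed rule to a digits-only string.
/// Hypothetical helper for illustration; not Cardinal's implementation.
fn is_valid_ec_phone(phone: &str) -> bool {
    // Assumption: any non-digit character makes the value abnormal.
    if !phone.bytes().all(|b| b.is_ascii_digit()) {
        return false;
    }
    let bytes = phone.as_bytes();
    // Landline numbers start with a digit between 2 and 7.
    let is_landline_lead = |d: u8| (b'2'..=b'7').contains(&d);
    match bytes.len() {
        // 8 digits: landline without country code.
        8 => is_landline_lead(bytes[0]),
        // 9 digits: mobile without country code, starts with 9.
        9 => bytes[0] == b'9',
        // 11 digits: 593 followed by a landline number.
        11 => phone.starts_with("593") && is_landline_lead(bytes[3]),
        // 12 digits: 593 followed by a mobile number.
        12 => phone.starts_with("5939"),
        // Any other length (including empty) is abnormal.
        _ => false,
    }
}
```

For example, `is_valid_ec_phone("22345678")` and `is_valid_ec_phone("593991234567")` return `true`, while an 8-digit value starting with 1 returns `false`.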

jpmckinney commented 9 months ago

Thanks - assigning @yolile as another option for a first indicator (besides open-contracting/bi.open-contracting.org#121).

yolile commented 5 months ago

@jpmckinney How generic do we want the rules configuration to be?

jpmckinney commented 5 months ago

Probably add the regex crate and allow regex configuration, e.g.:

[R042]
telephone = ^((593)?([2-7]|9\d)\d{7})$

And we would pre-process by removing non-digit characters. (I think there are optional leading zeroes in some countries at least – not sure if there's a consistent rule for removing those, or if we need to add that to the regex.)
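A minimal sketch of that pre-processing step, using only the standard library. The function name `normalize_telephone` and the `strip_leading_zero` flag are hypothetical; whether dropping a leading zero is correct for a given country is exactly the open question, so it is left as an opt-in switch here.

```rust
/// Reduce a raw telephone value to digits before matching it against a pattern.
/// Hypothetical helper; the `strip_leading_zero` option is an assumption,
/// not a confirmed rule for any country.
fn normalize_telephone(raw: &str, strip_leading_zero: bool) -> String {
    // Drop spaces, dashes, parentheses, "+" and any other non-digit character.
    let digits: String = raw.chars().filter(|c| c.is_ascii_digit()).collect();
    if strip_leading_zero {
        // Remove at most one leading zero, if present.
        if let Some(rest) = digits.strip_prefix('0') {
            return rest.to_string();
        }
    }
    digits
}
```

With this sketch, `normalize_telephone("+593 (2) 234-5678", false)` yields `59322345678`, which the configured pattern could then be matched against.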

Another (more robust) option is to use https://github.com/whisperfish/rust-phonenumber (the Python version is used in the Pelican backend). In that case, we only configure the allowed prefixes as a comma-separated list, and we'd need some logic to try each prefix if the number doesn't include one (though correct OCDS should).
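The "try each prefix" logic might look something like the sketch below. It deliberately does not call the phonenumber crate: the validator is passed in as a closure, so no library API is assumed. The names `parse_prefixes` and `valid_with_prefixes` are hypothetical.

```rust
/// Parse a comma-separated setting such as "593, 57" into a list of prefixes.
/// Hypothetical helper for illustration.
fn parse_prefixes(setting: &str) -> Vec<String> {
    setting
        .split(',')
        .map(|p| p.trim().to_string())
        .filter(|p| !p.is_empty())
        .collect()
}

/// Return true if `digits` validates as written, or after prepending one of
/// the allowed prefixes. `is_valid` stands in for whatever validator is used
/// (for example, a wrapper around a phone number library).
fn valid_with_prefixes<F>(digits: &str, prefixes: &[String], is_valid: F) -> bool
where
    F: Fn(&str) -> bool,
{
    // First try the number as-is, in case it already includes a prefix.
    if is_valid(digits) {
        return true;
    }
    // Otherwise, try each configured prefix in turn.
    prefixes
        .iter()
        .any(|p| is_valid(&format!("{}{}", p, digits)))
}
```

One design consequence: a number lacking a prefix is accepted if any configured prefix makes it valid, so the prefix list should be kept as short as the dataset allows.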

That said, does EC even have telephone numbers? When I checked last year, they did not have telephone or faxNumber fields: https://docs.google.com/spreadsheets/d/16Xud96U38Nf3Pz-nyaaZcvNe6_IyqjVAcsRmI81tuLk/edit#gid=1052318632

yolile commented 5 months ago

> That said, does EC even have telephone numbers?

Ah, you are right, they don't!

jpmckinney commented 5 months ago

Okay, I'm removing this from the EC milestone, as we have no phone numbers, and no methodology for addresses.