io-developer / php-whois

PHP WHOIS provides parsed and raw whois lookup of domains and ASN routes. PHP 8.0 compatible (5.4+ old versions)
MIT License
440 stars 118 forks source link

Remove dash from states #86

Closed bessone closed 5 years ago

bessone commented 5 years ago
Q A
Branch? master
Bug fix? no
New feature? no
BC breaks? no
Deprecations? no
Tests pass? yes
Fixed tickets #80
License MIT

Small change to the regex to handle the dash.

io-developer commented 5 years ago

Hi. What kind of responses contains dash in states? Don't see on issue examples... image image

io-developer commented 5 years ago

Anyway. I guess better to trim result for spaces and dash instead of fix regex Looks like more reliable. What you think?

bessone commented 5 years ago

As @deepika-maj said in #80, try the tessaana.com domain image

bessone commented 5 years ago

I have to try some domains, but I'm pretty sure some registry don't have states in a single string, but they are sentences with spaces. Removing all the text after the first space with a regex is not the solution, but surely you can think of a better regexp, but you should have more cases of answers to be sure. My idea is that a small change so far may suffice for a refactoring in the future.

io-developer commented 5 years ago

Not exactly. My point si keep regex as is in trim(preg_replace('~\(.+?\)|(http|<a href).+~ui', '', $state))

'someStatus - <a href=blablabla>' --> 'someStatus - ' Now we just trim remainder (spaces and other junk) trim('someStatus - ', ' -\t') => 'someStatus'

So fix become to trim(preg_replace('~\(.+?\)|(http|<a href).+~ui', '', $state), ' -\t')

bessone commented 5 years ago

Ok, understood! I think is pretty similar, remove the dash with the regex or remove it later with a trim. Maybe do it in the regex is a bit faster (ok, we talk about micro/milliseconds.... :D) I think we can trim the spaces in the regex also.

io-developer commented 5 years ago

Well, regex contains 2 parts and dash only in second. So if we got something like 'myStatus - (bla bla bla)' it becomes to 'myStatus - ' So trim can handle that case and keep regex more readable. I mean only that (not slower/faster) :D Anyway we have only one example so no matter what fix will be pushed

bessone commented 5 years ago

For me is the same, as you prefer! (and meanwhile I look forward to the day that RDAP will become a standard used by all)