Closed by antalgu 1 week ago
I made the changes for FR, and it seems to be capturing the H:M:S information now. I couldn't reproduce the same for KR, but made the change anyway. It looks like only the date is included in KR responses (with find_authoritative_server=False). For example, here's the output for google.kr:
query : google.kr
# KOREAN(UTF8)
도메인이름 : google.kr
등록인 : 구글코리아유한회사
등록인 주소 : 서울시 강남구 역삼동 737 강남파이낸스센터 22층
등록인 우편번호 : 135984
책임자 : Domain Administrator
책임자 전자우편 : dns-admin@google.com
책임자 전화번호 : 82.25319000
등록일 : 2007. 03. 02.
최근 정보 변경일 : 2010. 10. 04.
사용 종료일 : 2025. 03. 02.
정보공개여부 : Y
등록대행자 : (주)후이즈(http://whois.co.kr)
DNSSEC : 미서명
1차 네임서버 정보
호스트이름 : ns1.google.com
2차 네임서버 정보
호스트이름 : ns2.google.com
네임서버 이름이 .kr이 아닌 경우는 IP주소가 보이지 않습니다.
# ENGLISH
Domain Name : google.kr
Registrant : Google Korea, LLC
Registrant Address : 22nd Floor Gangnam Finance Center, 737 Yeoksam-dong Kangnam-ku Seoul
Registrant Zip Code : 135984
Administrative Contact(AC) : Domain Administrator
AC E-Mail : dns-admin@google.com
AC Phone Number : 82.25319000
Registered Date : 2007. 03. 02.
Last Updated Date : 2010. 10. 04.
Expiration Date : 2025. 03. 02.
Publishes : Y
Authorized Agency : Whois Corp.(http://whois.co.kr)
DNSSEC : unsigned
Primary Name Server
Host Name : ns1.google.com
Secondary Name Server
Host Name : ns2.google.com
- KISA/KRNIC WHOIS Service -
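Since the KR output above only carries dates in the form 2007. 03. 02., a date-only parse is probably the most that can be recovered there. A minimal sketch, assuming the English field label shown above; the pattern and variable names are hypothetical and not taken from the library:

```python
import re
from datetime import datetime

# Line copied from the google.kr output above.
sample = "Registered Date : 2007. 03. 02."

# Hypothetical pattern: tolerate variable whitespace around the colon
# and between the dot-separated date parts.
kr_match = re.search(r"Registered Date\s*:\s*(\d{4}\.\s*\d{2}\.\s*\d{2}\.)", sample)

# The KR format has no time component, so the result is midnight of that day.
registered = datetime.strptime(kr_match.group(1), "%Y. %m. %d.")
print(registered.date())  # 2007-03-02
```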
Nice, thanks for the fast response!
The whois lookup for klein-sujka.fr with find_authoritative_server = False returns empty dates.
This happens because the current regex "created: (\d{4}-\d{2}-\d{2})" only matches when there is exactly one space between created: and the date, while the response has more whitespace there. Replacing the single space with \s+ would match one or more whitespace characters and solve the problem, so the regex would end up as:
"created:\s+(\d{4}-\d{2}-\d{2})" (and the same for the other dates)
However, this would still ignore the time and capture only the date. I don't know whether the regex was written this way to handle French dates back when converting the format to datetime was more problematic, but if that was the case, it could perhaps be updated to:
"created: *(.+)"
I've tested 6 other .fr domains and they all return their dates in the format 2015-03-06T13:41:26Z, which is then correctly converted to a datetime. (The same can be done for last-update and Expiry Date.)
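To make the difference between the three patterns concrete, here is a minimal sketch. The sample line is hypothetical (the exact amount of padding in real .fr responses is an assumption), and the datetime conversion shown is plain strptime, not necessarily what the library does internally:

```python
import re
from datetime import datetime, timezone

# Hypothetical response line; the amount of padding is an assumption.
sample = "created:                       2015-03-06T13:41:26Z"

# Current pattern: requires exactly one space, so it fails on padded output.
strict = re.search(r"created: (\d{4}-\d{2}-\d{2})", sample)
# With \s+: matches, but captures only the date part.
date_only = re.search(r"created:\s+(\d{4}-\d{2}-\d{2})", sample)
# Proposed pattern: captures the whole value, including the time.
full = re.search(r"created: *(.+)", sample)

print(strict)              # None
print(date_only.group(1))  # 2015-03-06
print(full.group(1))       # 2015-03-06T13:41:26Z

# Convert the full capture to a timezone-aware datetime (the trailing Z means UTC).
created = datetime.strptime(
    full.group(1).strip(), "%Y-%m-%dT%H:%M:%SZ"
).replace(tzinfo=timezone.utc)
print(created.isoformat())  # 2015-03-06T13:41:26+00:00
```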
On the same topic, I've noticed that the KR regexes also capture only the date and omit the time. The same change might apply there, but I don't have the time to test it.