cloudflare / cfrpki

Cloudflare's RPKI Toolbox
https://rpki.cloudflare.com
BSD 3-Clause "New" or "Revised" License
177 stars 44 forks source link

APNIC rsync failback and LANIC AS0 #104

Open sarasalingam opened 2 years ago

sarasalingam commented 2 years ago

In Cloudflare's OctoRPKI, we have disabled RRDP failover with "-rrdp.failover=false", but it still goes to rsync for a few URLs (for APNIC).

Please find the relevant metrics below:

# TYPE rsync_errors gauge
rsync_errors{address="rsync://rpki-repository.nic.ad.jp/ap"} 146
rsync_errors{address="rsync://rpki.apnic.net/repository"} 146
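
To see at a glance which repositories are accumulating rsync errors, the samples above can be parsed mechanically. The following is a minimal sketch, assuming the text was copied from a Prometheus-format metrics page; the function name `parse_rsync_errors` is hypothetical and not part of OctoRPKI.

```python
import re

# Matches samples such as: rsync_errors{address="rsync://host/path"} 146
# Whitespace before "{" is tolerated because the paste above contains it.
METRIC_RE = re.compile(
    r'^(?P<name>\w+)\s*\{address="(?P<address>[^"]*)"\}\s+(?P<value>\S+)$'
)


def parse_rsync_errors(text):
    """Return {address: value} for every rsync_errors sample in `text`."""
    errors = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and "# TYPE" / "# HELP" comments
        match = METRIC_RE.match(line)
        if match and match.group("name") == "rsync_errors":
            errors[match.group("address")] = float(match.group("value"))
    return errors


sample = '''# TYPE rsync_errors gauge
rsync_errors{address="rsync://rpki-repository.nic.ad.jp/ap"} 146
rsync_errors{address="rsync://rpki.apnic.net/repository"} 146
'''

for address, count in sorted(parse_rsync_errors(sample).items()):
    print(address, count)
```

Comparing two snapshots taken a refresh interval apart would show whether the counter is still increasing or is left over from an earlier run.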

Could you please help us identify the possible issue? We understand the cause for JPNIC, but not for APNIC. Even though rsync fails, the ROA counts for APNIC are correct compared to the public sites, since the data is downloaded via RRDP.

Nov 19 08:34:16 rpki01 bbe10dbee28e[1531]: time="2021-11-18T21:34:16Z" level=info msg="RRDP: Downloading root notification https://rrdp.apnic.net/notification.xml"
Nov 19 08:34:16 rpki01 bbe10dbee28e[1531]: time="2021-11-18T21:34:16Z" level=info msg="RRDP: https://rrdp.apnic.net/notification.xml has 3 deltas to parse (cur: 95753, last: 95750)"
Nov 19 08:37:00 rpki01 bbe10dbee28e[1531]: time="2021-11-18T21:37:00Z" level=info msg="RRDP sync https://rrdp.sub.apnic.net/notification.xml"
Nov 19 08:37:00 rpki01 bbe10dbee28e[1531]: time="2021-11-18T21:37:00Z" level=info msg="RRDP: Downloading root notification https://rrdp.sub.apnic.net/notification.xml"
Nov 19 08:37:00 rpki01 bbe10dbee28e[1531]: time="2021-11-18T21:37:00Z" level=info msg="RRDP: https://rrdp.sub.apnic.net/notification.xml has 0 deltas to parse (cur: 1696, last: 1696)"

We have also noticed that OctoRPKI is not fetching the data for the LACNIC AS0 TAL. Please advise why it works for APNIC AS0 but not for LACNIC AS0.

100 17736 17714 66 Nov18 ? 1-11:56:46 ./octorpki -tal.root=tals/afrinic.tal,tals/apnic.tal,tals/arin.tal,tals/lacnic.tal,tals/ripe.tal,tals/lacnic-as0.tal -tal.name=AFRINIC,APNIC,ARIN,LACNIC,RIPE,LACNIC-AS0 -output.sign=false -rrdp.failover=false -refresh=600
100 17896 17876 6 Nov18 ? 03:43:42 ./gortr -loglevel debug -refresh 600 -rtr.refresh 600 -slurm /configs/slurm.json -ssh.bind :8282 -ssh.key private_new.pem -ssh.method.password=true -ssh.auth.user rpki -ssh.auth.password rpki -bind :8283 -cache http://octorpk:8081/output.json -verify=false

ties commented 2 years ago

What version of octorpki are you running - especially if using docker (since the public image is not up to date)?

ties commented 2 years ago

And for the LACNIC AS0 tal: Please check the content of the tal file. I had to manually add line breaks.

sarasalingam commented 2 years ago

Hi ties,

We are running version 1.2.2. We noticed that version 1.3.0 ran extremely slowly in the lab and in pre-production. The rsync behaviour for APNIC was the same there.

Debashish,

Can you please provide further comments about the Docker image?

regards, Skanda Arasalingam

sarasalingam commented 2 years ago

Hi Ties,

The same TAL works for Routinator. We checked the file size and looked for spaces and special characters. We are not sure what you mean by adding line breaks.

RPKI Trust Anchor (lacnic.net): https://www.lacnic.net/4984/2/lacnic/rpki-rpki-trust-anchor

Regards, Skanda Arasalingam

sarasalingam commented 2 years ago

@.*** tals]$ cat lacnic-as0.tal https://rrdp.lacnic.net/ta/rta-lacnic-rpki-as0.cer rsync://repository.lacnic.net/rpkias0/lacnic/rta-lacnic-rpki-as0.cer

MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAhW5FgZ9Foda5ZpboK99IzhnBG4Gu9t0M bzaqUI7rEH70RKbxpYtBguktrwVX3CaK7BiDtxOEtQv6iikt2DyfLZ14tpwoh/1NBqPilb+PfvNC N75LU9WYv5Fy651bC+N9kO7tAZeWY1NhZCYi3FjFjBRvv7IbUuWx5Us+xoV0g1jVVI5PI69Cbp/j 1a3CutCe92yJ5z9VTJQYXPw32ti0gAAERCepr21y4sO4rJiJtdDGk2+ezFzSgvgitX+/aqaoTpsD HCcSu0ScdsuY+XIQuq0f/Pcg/ClwSmRX2M+7nsbiOHv0GP4VubEW14u9lvu+XdpaPcZVBRldaP9h 5I1f2QIDAQAB @.*** tals]$


ties commented 2 years ago

Hi Skanda,

I see the same whitespace in that file that was in my TAL file (for example, before "5I1f"). It looks like OctoRPKI supports line breaks in the key, but not spaces within the lines.

# in your paste, there are whitespaces within a line:
...
1a3CutCe92yJ5z9VTJQYXPw32ti0gAAERCepr21y4sO4rJiJtdDGk2+ezFzSgvgitX+/aqaoTpsD HCcSu0ScdsuY+XIQuq0f/Pcg/ClwSmRX2M+7nsbiOHv0GP4VubEW14u9lvu+XdpaPcZVBRldaP9h 5I1f2QIDAQAB
# versus
$ cat ./lacnic-as0.tal
https://rrdp.lacnic.net/ta/rta-lacnic-rpki-as0.cer
rsync://repository.lacnic.net/rpkias0/lacnic/rta-lacnic-rpki-as0.cer

MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAhW5FgZ9Foda5ZpboK99IzhnBG4Gu9t0M
bzaqUI7rEH70RKbxpYtBguktrwVX3CaK7BiDtxOEtQv6iikt2DyfLZ14tpwoh/1NBqPilb+PfvNC
N75LU9WYv5Fy651bC+N9kO7tAZeWY1NhZCYi3FjFjBRvv7IbUuWx5Us+xoV0g1jVVI5PI69Cbp/j
1a3CutCe92yJ5z9VTJQYXPw32ti0gAAERCepr21y4sO4rJiJtdDGk2+ezFzSgvgitX+/aqaoTpsD
HCcSu0ScdsuY+XIQuq0f/Pcg/ClwSmRX2M+7nsbiOHv0GP4VubEW14u9lvu+XdpaPcZVBRldaP9h
5I1f2QIDAQAB

If you edit it as in the attached copy, it works for me: lacnic-as0.tal.zip
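
The manual edit described above can also be done mechanically. Below is a minimal sketch, assuming the usual TAL layout (URI lines, one blank line, then the base64 key); `normalize_tal` is a hypothetical helper, not something shipped with OctoRPKI. It puts each URI on its own line and turns every space inside the key section into a line break.

```python
import re


def normalize_tal(text):
    """Return TAL text with one URI per line and no spaces inside key lines."""
    # Split at the first blank line: URIs (and comments) above, key below.
    parts = re.split(r"\n[ \t]*\n", text.replace("\r\n", "\n").strip(), maxsplit=1)
    if len(parts) != 2:
        raise ValueError("expected a blank line between the URIs and the key")
    head, key = parts

    uri_lines = []
    for line in head.splitlines():
        if line.startswith("#"):
            uri_lines.append(line)          # keep comment lines untouched
        else:
            uri_lines.extend(line.split())  # one URI per line

    key_lines = key.split()  # splits on spaces and newlines alike
    return "\n".join(uri_lines) + "\n\n" + "\n".join(key_lines) + "\n"


if __name__ == "__main__":
    import sys

    # Usage (file name is an example): python normalize_tal.py lacnic-as0.tal
    if len(sys.argv) == 2:
        with open(sys.argv[1], encoding="ascii") as handle:
            sys.stdout.write(normalize_tal(handle.read()))
```

The sketch writes to standard output rather than editing the file in place, so the result can be compared against the original before replacing it.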

ties commented 2 years ago

I spotted this by running with -loglevel debug:

octorpki_1                   | time="2021-11-24T10:12:03Z" level=info msg="Validator started"
octorpki_1                   | time="2021-11-24T10:12:03Z" level=info msg="Serving HTTP on :8081/output.json"
octorpki_1                   | time="2021-11-24T10:12:03Z" level=debug msg="Fetching /tals/lacnic-as0.tal->/tals/lacnic-as0.tal"
octorpki_1                   | time="2021-11-24T10:12:03Z" level=info msg="Still exploring. Revalidating now"
octorpki_1                   | time="2021-11-24T10:12:03Z" level=error msg="file error for certificate: illegal base64 data at input byte 76"
octorpki_1                   | time="2021-11-24T10:12:03Z" level=debug msg="Fetching /tals/lacnic-as0.tal->/tals/lacnic-as0.tal"
octorpki_1                   | time="2021-11-24T10:12:03Z" level=info msg="Stable state. Revalidating in 10m0s"
octorpki_1                   | time="2021-11-24T10:12:03Z" level=error msg="file error for certificate: illegal base64 data at input byte 76"
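
The offset in that error lines up with the first space in the pasted key. Go's standard base64 decoder ignores CR and LF but rejects any other character outside the alphabet, and reports the offset of the first offending byte. The sketch below mimics only that scan; `first_illegal_offset` is a hypothetical helper, and the assumption that OctoRPKI hands the key section to the decoder as-is is mine, not confirmed in this thread.

```python
import string

# Standard base64 alphabet plus the padding character.
B64_ALPHABET = set(string.ascii_letters + string.digits + "+/=")


def first_illegal_offset(key_text):
    """Return the offset of the first character that is neither base64
    nor CR/LF, or None if the text is clean."""
    for offset, char in enumerate(key_text):
        if char in "\r\n":
            continue  # line breaks are skipped by the decoder
        if char not in B64_ALPHABET:
            return offset
    return None


# First two chunks of the key exactly as pasted earlier in the thread,
# separated by a space instead of a line break.
pasted = (
    "MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAhW5FgZ9Foda5ZpboK99IzhnBG4Gu9t0M "
    "bzaqUI7rEH70RKbxpYtBguktrwVX3CaK7BiDtxOEtQv6iikt2DyfLZ14tpwoh/1NBqPilb+PfvNC"
)

print(first_illegal_offset(pasted))
print(first_illegal_offset(pasted.replace(" ", "\n")))
```

Each chunk of the key is 76 characters long, so the first space sits at offset 76, the same position the log reports.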

sarasalingam commented 2 years ago

Hi Skanda,

Please find the attached Dockerfile for your reference.

Also, regarding the issue that we are currently facing for APNIC in OctoRPKI: we have disabled rrdp.failover, as shown in the command below.

command: "-tal.root=tals/afrinic.tal,tals/apnic.tal,tals/arin.tal,tals/lacnic.tal,tals/ripe.tal,tals/apnic-as0.tal,tals/lacnic-as0.tal -tal.name=AFRINIC,APNIC,ARIN,LACNIC,RIPE,APNIC-AS0,LACNIC-AS0 -output.sign=false -rrdp.failover=false -refresh=10m -loglevel=debug"

However, the validator is still fetching over rsync for APNIC. Please see the line below and the attached log file (rpki01.meb):

Nov 25 15:10:58 rpki01 9cdbcb392fe3[1318]: time="2021-11-25T04:10:58Z" level=debug msg="Fetching rsync://rpki.sub.apnic.net/repository/A9114E750000/0/0108398CA988382C2A509BFDB39E146A76CF9DE0.mft->cache/rpki.sub.apnic.net/repository/A9114E750000/0/0108398CA988382C2A509BFDB39E146A76CF9DE0.mft"

We have also checked the TAL file for APNIC; the rsync URI comes after the HTTPS URI. Below is the TAL file that we are currently using.

https://rpki.apnic.net/repository/apnic-rpki-root-iana-origin.cer
rsync://rpki.apnic.net/repository/apnic-rpki-root-iana-origin.cer

MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAx9RWSL61YAAYumEiU8z8
qH2ETVIL01ilxZlzIL9JYSORMN5Cmtf8V2JblIealSqgOTGjvSjEsiV73s67zYQI
7C/iSOb96uf3/s86NqbxDiFQGN8qG7RNcdgVuUlAidl8WxvLNI8VhqbAB5uSg/Mr
LeSOvXRja041VptAxIhcGzDMvlAJRwkrYK/Mo8P4E2rSQgwqCgae0ebY1CsJ3Cjf
i67C1nw7oXqJJovvXJ4apGmEv8az23OLC6Ki54Ul/E6xk227BFttqFV3YMtKx42H
cCcDVZZy01n7JjzvO8ccaXmHIgR7utnqhBRNNq5Xc5ZhbkrUsNtiJmrZzVlgU6Ou
0wIDAQAB

Ideally it should not fall back to rsync, but it does for APNIC.

Could you please advise what the possible issue could be here?

Thank you.

Kind regards, Prajakta Patil.


Nov 25 15:07:39 rpki01 9cdbcb392fe3[1318]: time="2021-11-25T04:07:39Z" level=debug msg="Fetching rsync://rpki.apnic.net/member_repository/A91DCD09/EAC7EEA0112D11EC8D87057DC4F9AE02/rbmJeP11H5nZqOsq5y08DFUEU6Q.mft->cache/rpki.apnic.net/member_repository/A91DCD09/EAC7EEA0112D11EC8D87057DC4F9AE02/rbmJeP11H5nZqOsq5y08DFUEU6Q.mft"
Nov 25 15:07:39 rpki01 9cdbcb392fe3[1318]: time="2021-11-25T04:07:39Z" level=debug msg="Fetching rsync://rpki.apnic.net/member_repository/A919EAAD/87C3168809D511EAB76C7317C4F9AE02/9TmDt6uDhsaZWHhP-vDmlyYkV6c.mft->cache/rpki.apnic.net/member_repository/A919EAAD/87C3168809D511EAB76C7317C4F9AE02/9TmDt6uDhsaZWHhP-vDmlyYkV6c.mft"
Nov 25 15:07:39 rpki01 9cdbcb392fe3[1318]: time="2021-11-25T04:07:39Z" level=debug msg="Fetching rsync://rpki.apnic.net/member_repository/A91972B6/08243C8C132B11E9AE77AD7BC4F9AE02/leqvwkta9LFytYuwGnAc_hltoZk.mft->cache/rpki.apnic.net/member_repository/A91972B6/08243C8C132B11E9AE77AD7BC4F9AE02/leqvwkta9LFytYuwGnAc_hltoZk.mft"
Nov 25 15:07:39 rpki01 9cdbcb392fe3[1318]: time="2021-11-25T04:07:39Z" level=debug msg="Fetching rsync://rpki.apnic.net/member_repository/A917E678/B897C582A19C11EB92A2DF29C4F9AE02/4kPzvoJVabsWrLN0vJjavmkLF8E.mft->cache/rpki.apnic.net/member_repository/A917E678/B897C582A19C11EB92A2DF29C4F9AE02/4kPzvoJVabsWrLN0vJjavmkLF8E.mft"
Nov 25 15:10:57 rpki01 9cdbcb392fe3[1318]: time="2021-11-25T04:10:57Z" level=debug msg="Fetching rsync://rpki.sub.apnic.net/repository/A9192A980000/4/F67E02BEC4DE61E2EB5953B5DC335E56FFCDCE70.mft->cache/rpki.sub.apnic.net/repository/A9192A980000/4/F67E02BEC4DE61E2EB5953B5DC335E56FFCDCE70.mft"
Nov 25 15:10:57 rpki01 9cdbcb392fe3[1318]: time="2021-11-25T04:10:57Z" level=debug msg="Fetching rsync://rpki.sub.apnic.net/repository/A91905300000/2/84587BF67D2ADA31FC7E0FF93DE2E266D33164C0.mft->cache/rpki.sub.apnic.net/repository/A91905300000/2/84587BF67D2ADA31FC7E0FF93DE2E266D33164C0.mft"
Nov 25 15:10:58 rpki01 9cdbcb392fe3[1318]: time="2021-11-25T04:10:58Z" level=debug msg="Fetching rsync://rpki.sub.apnic.net/repository/A9114E750000/0/0108398CA988382C2A509BFDB39E146A76CF9DE0.mft->cache/rpki.sub.apnic.net/repository/A9114E750000/0/0108398CA988382C2A509BFDB39E146A76CF9DE0.mft"
Nov 25 15:12:39 rpki01 9cdbcb392fe3[1318]: time="2021-11-25T04:12:39Z" level=debug msg="Fetching rsync://rpki.sub.apnic.net/repository/A9192A980000/1/D40581CA9DDACA9E110165B11DD2820DD7F532C0.mft->cache/rpki.sub.apnic.net/repository/A9192A980000/1/D40581CA9DDACA9E110165B11DD2820DD7F532C0.mft"
Nov 25 15:12:51 rpki01 9cdbcb392fe3[1318]: time="2021-11-25T04:12:51Z" level=debug msg="Fetching rsync://rpki.sub.apnic.net/repository/A9192A980000/2/211A048890969FA7D4B6AEF8C020CDA4444EC2E5.mft->cache/rpki.sub.apnic.net/repository/A9192A980000/2/211A048890969FA7D4B6AEF8C020CDA4444EC2E5.mft"
Nov 25 15:13:02 rpki01 9cdbcb392fe3[1318]: time="2021-11-25T04:13:02Z" level=debug msg="Fetching rsync://rpki.sub.apnic.net/repository/A9192A980000/3/5EAD10BE7EC295336E4B5680E0D393B677C3649A.mft->cache/rpki.sub.apnic.net/repository/A9192A980000/3/5EAD10BE7EC295336E4B5680E0D393B677C3649A.mft"
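
To narrow down which repositories produce these "Fetching rsync://" debug lines, the log can be summarised per host. This is a minimal sketch that relies only on the message format visible in the lines above; `rsync_fetches_by_host` is a hypothetical helper, and the sketch does not tell whether such a line means a network fetch or only a read from the local cache.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Captures the source URI of lines such as:
#   msg="Fetching rsync://host/path/file.mft->cache/host/path/file.mft"
# The non-greedy match stops at the first "->", so hyphens in file names are fine.
FETCH_RE = re.compile(r"Fetching\s+(rsync://\S+?)->")


def rsync_fetches_by_host(log_text):
    """Count 'Fetching rsync://...' debug lines per repository host."""
    hosts = Counter()
    for match in FETCH_RE.finditer(log_text):
        hosts[urlsplit(match.group(1)).hostname] += 1
    return hosts


sample_log = '''
Nov 25 15:07:39 rpki01 9cdbcb392fe3[1318]: time="2021-11-25T04:07:39Z" level=debug msg="Fetching rsync://rpki.apnic.net/member_repository/A919EAAD/87C3168809D511EAB76C7317C4F9AE02/9TmDt6uDhsaZWHhP-vDmlyYkV6c.mft->cache/rpki.apnic.net/member_repository/A919EAAD/87C3168809D511EAB76C7317C4F9AE02/9TmDt6uDhsaZWHhP-vDmlyYkV6c.mft"
Nov 25 15:10:57 rpki01 9cdbcb392fe3[1318]: time="2021-11-25T04:10:57Z" level=debug msg="Fetching rsync://rpki.sub.apnic.net/repository/A9192A980000/4/F67E02BEC4DE61E2EB5953B5DC335E56FFCDCE70.mft->cache/rpki.sub.apnic.net/repository/A9192A980000/4/F67E02BEC4DE61E2EB5953B5DC335E56FFCDCE70.mft"
Nov 25 15:10:58 rpki01 9cdbcb392fe3[1318]: time="2021-11-25T04:10:58Z" level=debug msg="Fetching rsync://rpki.sub.apnic.net/repository/A9114E750000/0/0108398CA988382C2A509BFDB39E146A76CF9DE0.mft->cache/rpki.sub.apnic.net/repository/A9114E750000/0/0108398CA988382C2A509BFDB39E146A76CF9DE0.mft"
'''

for host, count in rsync_fetches_by_host(sample_log).most_common():
    print(host, count)
```

Running it over the full attached log would show whether the rsync URIs are limited to the two APNIC hosts or appear for other repositories as well.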

ARG src_dir="/octorpki"

FROM golang:alpine3.14 as builder
ARG src_dir
ARG LDFLAGS=""

RUN apk --update --no-cache add git && \
    mkdir -p ${src_dir}

WORKDIR ${src_dir}
COPY . .

RUN go build -ldflags "${LDFLAGS}" cmd/octorpki/octorpki.go

FROM alpine:3.14
ARG src_dir

RUN apk --update --no-cache add ca-certificates rsync && \
    adduser -S -D -H -h / rpki && \
    mkdir /cache && chmod 770 /cache && chown rpki:root /cache && \
    touch rrdp.json && chown rpki rrdp.json
USER rpki

COPY --from=builder ${src_dir}/octorpki ${src_dir}/cmd/octorpki/private.pem /
COPY --from=builder ${src_dir}/cmd/octorpki/tals /tals

VOLUME ["/cache"]

ENTRYPOINT ["./octorpki"]

sarasalingam commented 2 years ago

Thanks mate. Your advice has resolved the LACNIC AS0 issue for us.

sarasalingam commented 2 years ago

We are still not sure why exactly it is falling back to rsync for APNIC. Please advise.