MetPX / sarracenia

https://MetPX.github.io/sarracenia
GNU General Public License v2.0

tls_rigour strict broken. #329

Closed petersilva closed 4 years ago

petersilva commented 4 years ago

@benlapETS found this while building tests: if you set tls_rigour to strict, it is totally broken. We see this code in sr_config.py:


self.tlsctx = ssl.create_default_context()
self.tlsctx.check_hostname = True
self.tlsctx.verify_mode = ssl.CERT_REQUIRED
self.tlsctx.protocol = ssl.PROTOCOL_TLSv1_2

which bombs in many different ways... something that is closer to correct:


self.tlsctx = ssl.SSLContext(ssl.PROTOCOL_TLSv1_2)
self.tlsctx.verify_mode = ssl.CERT_REQUIRED
self.tlsctx.load_default_certs()
self.tlsctx.verify_flags=ssl.VERIFY_X509_STRICT|ssl.VERIFY_X509_TRUSTED_FIRST
self.tlsctx.check_hostname=True

This works with properly configured sites, but something is missing: if one includes ssl.VERIFY_CRL_CHECK_CHAIN in verify_flags, one always gets:


ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get certificate CRL (_ssl.c:1108)

It is unclear how to add CRL chain verification.
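For reference, a minimal sketch of a strict client context that leaves CRL checking out. The helper name `strict_client_context` is hypothetical and is not the code in sr_config.py. It assumes Python 3.7+ (for `minimum_version`). Note that `SSLContext.protocol` is documented as read-only, so the protocol has to be chosen when the context is constructed rather than assigned afterwards.

```python
import ssl


def strict_client_context() -> ssl.SSLContext:
    """Hypothetical helper: strict client-side TLS context, no CRL checks."""
    # PROTOCOL_TLS_CLIENT turns on check_hostname and CERT_REQUIRED by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Refuse anything older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Load the system CA store; without it every chain looks untrusted.
    ctx.load_default_certs()
    # Add strict X.509 checks; VERIFY_CRL_CHECK_CHAIN is deliberately omitted.
    ctx.verify_flags |= ssl.VERIFY_X509_STRICT | ssl.VERIFY_X509_TRUSTED_FIRST
    return ctx
```

The resulting context can be passed as `context=` to `urllib.request.urlopen`, as in the interpreter session further down.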

benlapETS commented 4 years ago

https://pythonhosted.org/python-libtls/tutorial.html

benlapETS commented 4 years ago

From the ssl docs, have you read this?

Protocol versions

SSL versions 2 and 3 are considered insecure and are therefore dangerous to use. If you want maximum compatibility between clients and servers, it is recommended to use PROTOCOL_TLS_CLIENT or PROTOCOL_TLS_SERVER as the protocol version. SSLv2 and SSLv3 are disabled by default.

client_context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
client_context.options |= ssl.OP_NO_TLSv1
client_context.options |= ssl.OP_NO_TLSv1_1

The SSL context created above will only allow TLSv1.2 and later (if supported by your system) connections to a server. PROTOCOL_TLS_CLIENT implies certificate validation and hostname checks by default. You have to load certificates into the context.

But maybe we want both CLIENT and SERVER, so it would be PROTOCOL_TLS
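A small sketch of what the two purpose-specific protocol constants give you out of the box, assuming only the standard `ssl` module. The function name `default_settings` is made up for illustration. Since the code in question connects out to brokers and web servers, the client defaults are probably the relevant ones.

```python
import ssl


def default_settings(protocol: int) -> tuple:
    """Return (check_hostname, verify_mode) of a freshly built context."""
    ctx = ssl.SSLContext(protocol)
    return ctx.check_hostname, ctx.verify_mode


# Client contexts verify the peer by default; server contexts do not.
client_defaults = default_settings(ssl.PROTOCOL_TLS_CLIENT)
server_defaults = default_settings(ssl.PROTOCOL_TLS_SERVER)
print("client:", client_defaults)
print("server:", server_defaults)
```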

benlapETS commented 4 years ago

Tried that:

self.tlsctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
self.tlsctx.options |= ssl.OP_NO_TLSv1
self.tlsctx.options |= ssl.OP_NO_TLSv1_1
self.tlsctx.check_hostname = True
self.tlsctx.verify_mode = ssl.CERT_REQUIRED
self.tlsctx.load_default_certs()
self.tlsctx.verify_flags = ssl.VERIFY_CRL_CHECK_CHAIN

Got this when subscribing with dd_amis:

2020-04-24 10:45:08,063 [ERROR] Download failed 3 https://dd5.weather.gc.ca//bulletins/alphanumeric/20200424/UB/KWBC/14/UBUS31_KWBC_241440___43345
2020-04-24 10:45:08,063 [DEBUG] Exception details: 
Traceback (most recent call last):
  File "/home/benoit/anaconda3/envs/sarracenia/lib/python3.8/urllib/request.py", line 1319, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/home/benoit/anaconda3/envs/sarracenia/lib/python3.8/http/client.py", line 1230, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/home/benoit/anaconda3/envs/sarracenia/lib/python3.8/http/client.py", line 1276, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/home/benoit/anaconda3/envs/sarracenia/lib/python3.8/http/client.py", line 1225, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/home/benoit/anaconda3/envs/sarracenia/lib/python3.8/http/client.py", line 1004, in _send_output
    self.send(msg)
  File "/home/benoit/anaconda3/envs/sarracenia/lib/python3.8/http/client.py", line 944, in send
    self.connect()
  File "/home/benoit/anaconda3/envs/sarracenia/lib/python3.8/http/client.py", line 1399, in connect
    self.sock = self._context.wrap_socket(self.sock,
  File "/home/benoit/anaconda3/envs/sarracenia/lib/python3.8/ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "/home/benoit/anaconda3/envs/sarracenia/lib/python3.8/ssl.py", line 1040, in _create
    self.do_handshake()
  File "/home/benoit/anaconda3/envs/sarracenia/lib/python3.8/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1108)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/benoit/git/sarracenia/sarra/sr_util.py", line 482, in download
    self.get(remote_file,new_lock,remote_offset,msg.local_offset,msg.length)
  File "/home/benoit/git/sarracenia/sarra/sr_util.py", line 550, in get
    self.proto.get(remote_file, local_file, remote_offset, local_offset, length)
  File "/home/benoit/git/sarracenia/sarra/sr_http.py", line 131, in get
    ok  = self.__open__(url, remote_offset, length )
  File "/home/benoit/git/sarracenia/sarra/sr_http.py", line 265, in __open__
    self.http = urllib.request.urlopen(self.req, timeout=self.timeout, context=ctx)
  File "/home/benoit/anaconda3/envs/sarracenia/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/home/benoit/anaconda3/envs/sarracenia/lib/python3.8/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/home/benoit/anaconda3/envs/sarracenia/lib/python3.8/urllib/request.py", line 542, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/home/benoit/anaconda3/envs/sarracenia/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/home/benoit/anaconda3/envs/sarracenia/lib/python3.8/urllib/request.py", line 1362, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/home/benoit/anaconda3/envs/sarracenia/lib/python3.8/urllib/request.py", line 1322, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1108)>
benlapETS commented 4 years ago

Having self-signed certificates in the chain is a problem. But that is understandable, as we load the default certificates. Would we need to load our self-signed certificate manually there? And would that be OK from a security point of view?

benlapETS commented 4 years ago

With the verify flag commented out, it worked well on dd...

petersilva commented 4 years ago

yeah, except there aren't any self-signed certs in the chain. I tested by just using the calls to build a context in a python interpreter... and did the open on google.com:

>>> c=ssl.SSLContext( ssl.PROTOCOL_TLSv1_2 )
>>> c.verify_mode=ssl.CERT_REQUIRED
>>> c.load_default_certs()
>>> c.set_default_verify_paths()
>>> c.Check_hostname=True
>>> c.check_hostname=True
>>> c.verify_flags=ssl.VERIFY_CRL_CHECK_CHAIN|ssl.VERIFY_X509_STRICT|ssl.VERIFY_X509_TRUSTED_FIRST
>>> h=ur.urlopen("https://www.google.com", context=c)
Traceback (most recent call last):
  File "/usr/lib/python3.8/urllib/request.py", line 1319, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/usr/lib/python3.8/http/client.py", line 1230, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1276, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1225, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1004, in _send_output
    self.send(msg)
  File "/usr/lib/python3.8/http/client.py", line 944, in send
    self.connect()
  File "/usr/lib/python3.8/http/client.py", line 1399, in connect
    self.sock = self._context.wrap_socket(self.sock,
  File "/usr/lib/python3.8/ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "/usr/lib/python3.8/ssl.py", line 1040, in _create
    self.do_handshake()
  File "/usr/lib/python3.8/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get certificate CRL (_ssl.c:1108)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.8/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/usr/lib/python3.8/urllib/request.py", line 542, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 1362, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/usr/lib/python3.8/urllib/request.py", line 1322, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get certificate CRL (_ssl.c:1108)>
>>>
benlapETS commented 4 years ago

From the doc: https://docs.python.org/3/library/ssl.html#self-signed-certificates

Self-signed certificates

If you are going to create a server that provides SSL-encrypted connection services, you will need to acquire a certificate for that service. There are many ways of acquiring appropriate certificates, such as buying one from a certification authority. Another common practice is to generate a self-signed certificate. The simplest way to do this is with the OpenSSL package, using something like the following:

benlapETS commented 4 years ago

You are not getting the same error as me, though...

petersilva commented 4 years ago

yeah... I saw that issue when load_default_certs() wasn't invoked... so all certs were interpreted as self-signed... when I tried your settings in the python interpreter, I get the same results with dd.weather or google.com... so I don't think there is any problem with dd.weather...
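One way to check whether the trust store was actually loaded is `SSLContext.cert_store_stats()`. This is a sketch only, and the function name is made up. The counts after loading depend on the system: where OpenSSL uses a certificate directory (capath), certificates are loaded lazily and the numbers can stay at zero until a handshake has happened.

```python
import ssl


def store_stats_before_and_after() -> tuple:
    """Return cert_store_stats() before and after load_default_certs()."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    before = ctx.cert_store_stats()  # fresh context: nothing loaded yet
    ctx.load_default_certs()
    after = ctx.cert_store_stats()   # system dependent
    return before, after


before, after = store_stats_before_and_after()
print("before:", before)
print("after: ", after)
```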

benlapETS commented 4 years ago

Yeah I got the same too...

benlapETS commented 4 years ago

That is correct, I must have forgotten to load the defaults the first time.

petersilva commented 4 years ago

yeah, so everything other than CRL is now clean... I think we should probably just go with that, and forget about CRLs for now... I don't see a reasonable/practical way to add honouring of CRL's (it should be built-in to the ssl library... but since it isn't, we are kind of stuck.)

benlapETS commented 4 years ago

Agreed. I tested something with CRL using openssl, and that didn't work either (tested it with google, dd weather, github). Here is the link to what I tried: https://raymii.org/s/articles/OpenSSL_manually_verify_a_certificate_against_a_CRL.html

benlapETS commented 4 years ago

Every time, it got stuck with something like this:

> openssl verify -crl_check -CAfile crl_chain.pem google.pem
C = US, O = Google Trust Services, CN = GTS CA 1O1
error 2 at 1 depth lookup: unable to get issuer certificate
error google.pem: verification failed
benlapETS commented 4 years ago

The idea was that, if this had worked, we could have made it work by providing a CA location with load_verify_locations()...

This link dragged me into that hole: https://stackoverflow.com/questions/39297240/python-failed-to-verify-any-crls-for-ssl-tls-connections

benlapETS commented 4 years ago

Also found https://bugs.python.org/issue34078, which comes directly from the devs... this doesn't work as expected either...

petersilva commented 4 years ago

yeah, it sounds like we need to do this:


fractal%  openssl s_client -connect revoked.badssl.com:443 -servername revoked.badssl.com | openssl x509 -text -noout | grep crl
depth=2 C = US, O = DigiCert Inc, OU = www.digicert.com, CN = DigiCert Global Root CA
verify return:1
depth=1 C = US, O = DigiCert Inc, CN = DigiCert SHA2 Secure Server CA
verify return:1
depth=0 C = US, ST = California, L = Walnut Creek, O = Lucas Garron Torres, CN = revoked.badssl.com
verify return:1
                  URI:http://crl3.digicert.com/ssca-sha2-g6.crl
                  URI:http://crl4.digicert.com/ssca-sha2-g6.crl

^C
fractal% curl -O http://crl3.digicert.com/ssca-sha2-g6.crl
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  235k  100  235k    0     0  1637k      0 --:--:-- --:--:-- --:--:-- 1637k
fractal% openssl crl -in ssca-sha2-g6.crl -inform DER -out ssca-sha2-g6.pem.crl -outform PEM
fractal%
fractal% ./testcrl.py
Traceback (most recent call last):
  File "./testcrl.py", line 19, in <module>
    s.connect(('revoked.badssl.com', 443))
  File "/usr/lib/python3.8/ssl.py", line 1342, in connect
    self._real_connect(addr, False)
  File "/usr/lib/python3.8/ssl.py", line 1333, in _real_connect
    self.do_handshake()
  File "/usr/lib/python3.8/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate revoked (_ssl.c:1108)
fractal%

in other words: look up the CRL for each site, download it to a local store somewhere, convert it to a format the Python ssl module can deal with (.pem), provide the correct CRL for each host looked up (maintain a mapping), and then check the certificate we got against it.
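The contents of testcrl.py are not shown here, so the following is only a sketch of what the convert-and-load steps could look like in Python. Both function names are hypothetical. It assumes the CRL has already been downloaded in DER form, and it uses `VERIFY_CRL_CHECK_LEAF` because only the CRL covering the server certificate is supplied; `VERIFY_CRL_CHECK_CHAIN` would also need a CRL for every intermediate.

```python
import base64
import ssl


def der_crl_to_pem(der_bytes: bytes) -> str:
    """Wrap a DER-encoded CRL in PEM armour (same job as `openssl crl`)."""
    b64 = base64.b64encode(der_bytes).decode("ascii")
    # PEM bodies are conventionally wrapped at 64 characters per line.
    body = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    return "\n".join(["-----BEGIN X509 CRL-----", *body,
                      "-----END X509 CRL-----", ""])


def context_with_crl(crl_pem_path: str) -> ssl.SSLContext:
    """Client context that checks the leaf certificate against one CRL file."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.load_default_certs()
    # load_verify_locations() also accepts CRLs, per the ssl documentation.
    ctx.load_verify_locations(cafile=crl_pem_path)
    ctx.verify_flags |= ssl.VERIFY_CRL_CHECK_LEAF
    return ctx
```

Keeping the host-to-CRL mapping current (CRLs expire and are reissued) is the part this sketch does not address.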

It sounds like something that should be in the library, not something every client using SSL should need to do. It smells bad.

petersilva commented 4 years ago

Please prepare a patch without the CRL stuff. and we can leave CRL as a future enhancement. or not...

further down the rabbit hole, just for interest's sake:

https://www.ssl.com/article/how-do-browsers-handle-revoked-ssl-tls-certificates/

petersilva commented 4 years ago

fix 987f5f8db33f81d5f150d4a68f7518955f3641e7 merged!

petersilva commented 4 years ago

ugh... OK the code works for getting the context... but the downloads all fail, when the configuration is valid:


2020-04-24 18:48:20,207 [INFO] Number of messages in retry list 7
2020-04-24 18:48:20,207 [INFO] sr_retry on_heartbeat elapse 0.002819
2020-04-24 18:48:20,298 [ERROR] Download failed 5 https://hpfx.collab.science.gc.ca/20200424/WXO-DD/bulletins/alphanumeric/20200424/SR/KWAL/22/SRCN40_KWAL_242244___45508 
2020-04-24 18:48:20,299 [ERROR] Failed to reach server. Reason: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:852

so perhaps more work to do...

benlapETS commented 4 years ago

I don't understand why it would work on dd.weather and not on hpfx.collab. I will fix that.

petersilva commented 4 years ago

that's not it... it works fine on both. We were only looking at login, not the subsequent download. The first patch is good: it fixes the option parsing/login stuff, but the subsequent downloads using the context fail...

benlapETS commented 4 years ago

I have looked at download..:

2020-04-27 10:22:08,587 [INFO] AMQP  broker(dd.weather.gc.ca) user(anonymous) vhost(/)
2020-04-27 10:22:08,587 [INFO] Using amqp module (AMQP 0-9-1)
2020-04-27 10:22:08,820 [INFO] Binding queue q_anonymous.sr_subscribe.dd_amis.06113827.02597803 with key v02.post.bulletins.alphanumeric.# from exchange xpublic on broker amqps://anonymous@dd.weather.gc.ca/
2020-04-27 10:22:08,919 [INFO] declared queue q_anonymous.sr_subscribe.dd_amis.06113827.02597803 (anonymous@dd.weather.gc.ca) 
2020-04-27 10:22:08,945 [INFO] reading from to anonymous@dd.weather.gc.ca, exchange: xpublic
2020-04-27 10:22:08,987 [INFO] report_back to anonymous@dd.weather.gc.ca, exchange: xs_anonymous
2020-04-27 10:22:08,987 [INFO] sr_retry on_heartbeat
2020-04-27 10:22:08,989 [INFO] No retry in list
2020-04-27 10:22:08,989 [INFO] sr_retry on_heartbeat elapse 0.001865
2020-04-27 10:22:10,611 [INFO] file_log downloaded to: /home/benoit/git/sarracenia/SPCN49_CWAO_271421__CYCX_33973
2020-04-27 10:22:13,747 [INFO] file_log downloaded to: /home/benoit/git/sarracenia/SRME20_KWAL_271421___64816
2020-04-27 10:22:14,151 [INFO] file_log downloaded to: /home/benoit/git/sarracenia/SXCN40_KWAL_271421___48756
2020-04-27 10:22:14,384 [INFO] file_log downloaded to: /home/benoit/git/sarracenia/SRCN40_KWAL_271421___59280
2020-04-27 10:22:14,598 [INFO] file_log downloaded to: /home/benoit/git/sarracenia/SAXX60_KWBC_271400_RRR__34487
2020-04-27 10:22:14,815 [INFO] file_log downloaded to: /home/benoit/git/sarracenia/SAUS70_KWBC_271422_RRC__18207
2020-04-27 10:22:15,019 [INFO] file_log downloaded to: /home/benoit/git/sarracenia/SRCN40_KWAL_271421___36347
benlapETS commented 4 years ago

with this config

# this is a feed of wmo bulletin (a set called AMIS in the old times)

# if this host doesn't work, comment the line and use the default one
tls_rigour strict
broker amqps://dd.weather.gc.ca/
# broker amqps://hpfx.collab.science.gc.ca/

# instances: number of downloading processes to run at once.  defaults to 1. Not enough for this case
instances 5

# expire, in operational use, should be longer than longest expected interruption
expire 10m

subtopic bulletins.alphanumeric.#

accept .*
benlapETS commented 4 years ago

removed subtopic and it did download from hpfx:

2020-04-27 10:36:58,587 [INFO] AMQP  broker(hpfx.collab.science.gc.ca) user(anonymous) vhost(/)
2020-04-27 10:36:58,587 [INFO] Using amqp module (AMQP 0-9-1)
2020-04-27 10:36:58,787 [INFO] Binding queue q_anonymous.sr_subscribe.dd_amis.06113827.02597803 with key v02.post.# from exchange xpublic on broker amqps://anonymous@hpfx.collab.science.gc.ca/
2020-04-27 10:36:58,816 [INFO] declared queue q_anonymous.sr_subscribe.dd_amis.06113827.02597803 (anonymous@hpfx.collab.science.gc.ca) 
2020-04-27 10:36:58,830 [INFO] reading from to anonymous@hpfx.collab.science.gc.ca, exchange: xpublic
2020-04-27 10:36:58,857 [INFO] report_back to anonymous@hpfx.collab.science.gc.ca, exchange: xs_anonymous
2020-04-27 10:36:58,857 [INFO] sr_retry on_heartbeat
2020-04-27 10:36:58,859 [INFO] No retry in list
2020-04-27 10:36:58,860 [INFO] sr_retry on_heartbeat elapse 0.002438
2020-04-27 10:36:59,092 [INFO] file_log downloaded to: /home/benoit/git/sarracenia/2020-04-27-1433-CWOO-AUTO-minute-swob.xml
2020-04-27 10:36:59,314 [INFO] file_log downloaded to: /home/benoit/git/sarracenia/2020-04-27-1436-CZOC-AUTO-minute-swob.xml
2020-04-27 10:36:59,540 [INFO] file_log downloaded to: /home/benoit/git/sarracenia/SRCN40_KWAL_271436___5879
2020-04-27 10:36:59,749 [INFO] file_log downloaded to: /home/benoit/git/sarracenia/SXCN40_KWAL_271436___51573
2020-04-27 10:36:59,985 [INFO] file_log downloaded to: /home/benoit/git/sarracenia/2020-04-27-1435-CZVM-AUTO-minute-swob.xml
2020-04-27 10:37:00,222 [INFO] file_log downloaded to: /home/benoit/git/sarracenia/2020-04-27-1436-CWPU-AUTO-minute-swob.xml
2020-04-27 10:37:00,435 [INFO] file_log downloaded to: /home/benoit/git/sarracenia/2020-04-27-1436-CVQZ-AUTO-minute-swob.xml
2020-04-27 10:37:00,659 [INFO] file_log downloaded to: /home/benoit/git/sarracenia/2020-04-27-1436-CWMM-AUTO-minute-swob.xml
benlapETS commented 4 years ago

ok a new thing on hpfx (dd still works):

2020-04-27 10:58:33,016 [ERROR] AMQP cannot connect to hpfx.collab.science.gc.ca with (0, 0): (403) ACCESS_REFUSED - Login was refused using authentication mechanism AMQPLAIN. For details see the broker logfile.

benlapETS commented 4 years ago

From what I see, those problems may be unrelated to tls_rigour; the job I restarted on Travis also failed with socket problems:

ERROR Summary:
  688  sr_shovel_pclean_f90        (5 file)  [ERROR] file not in folder posted_by_shim with 20.086s elapsed
  223  sr_subscribe_q_f71          (5 file)  [ERROR] util/writelocal mismatched file length writing CACN00_CWAO_270701__WRT_56266.
  52   sr_subscribe_ftp_f70        (6 file)  [ERROR] Download failed 3 ftp://anonymous@localhost:2121//sent_by_tsource2send/202004
  26   sr_cpump_xvan_f14           (2 file)  [ERROR] binding failed: server channel error 404h, message: NOT_FOUND - no exchange '
  16   sr_cpump_pelle_dd1_f04      (2 file)  [ERROR] Failed to open AMQP socket host: hpfx.collab.science.gc.ca, port: 5671
  16   sr_cpump_pelle_dd1_f04      (2 file)  [ERROR] failed opening AMQP socket: SSL handshake failed
  13   srposter_f00                (1 file)  [ERROR] could not post /home/travis/sarra_devdocroot/sent_by_tsource2send/20200427/WX
  8    sr_cpump_pelle_dd1_f04      (2 file)  [ERROR] Failed AMQP login user: anonymous
  8    sr_cpump_pelle_dd1_f04      (2 file)  [ERROR] failed AMQP login: server connection error 403h, message: ACCESS_REFUSED - Lo
  4    sr_sender_tsource2send_f50  (4 file)  [ERROR] Delivery failed /home/travis/sarra_devdocroot/sent_by_tsource2send/20200427/W

WARNING Summary:
  21101  sr_report_tsarra_f20         (23 file)  [WARNING] total: Excessive lag! downloading too slowly/late 2 minutes ago behind
  84     sr_subscribe_cp_f61          (15 file)  [WARNING] could not move /home/travis/sarra_devdocroot/downloaded_by_sub_cp/20200
  26     sr_cpump_xvan_f14            (2 file)   [WARNING] Due to failure, sleeping for 16 seconds to try to re-connect.
  26     sr_subscribe_ftp_f70         (6 file)   [WARNING] downloading again, attempt 2
  10     sr_shovel_pclean_f90         (10 file)  [WARNING] on_heartbeat spent more than 10% of heartbeat (20)
  10     sr_shovel_pclean_f90         (10 file)  [WARNING] heartbeat set to 120
  5      sr_cpost_veille_f34          (1 file)   [WARNING] INFO: watch, one pass takes longer (2.00304) than sleep interval (2), n
  4      sr_sender_tsource2send_f50   (4 file)   [WARNING] sending again, attempt 2
  1      sr_watch_f40                 (1 file)   [WARNING] Should invoke 3: /home/travis/virtualenv/python3.6.7/bin/sr_watch [args
  1      sr_subscribe_rabbitmqtt_f31  (1 file)   [WARNING] mv /home/travis/sarra_devdocroot/downloaded_by_sub_rabbitmqtt/20200427/
petersilva commented 4 years ago

anyways... tls_rigour does work in the normal case. I tried it this morning, and hpfx works fine now with tls_rigour strict.

for the other points: the tests have always been flaky, before and now... I restart the job after nearly every commit. The ones that fail are random... it's an issue we would love to fix, but there is nothing obvious. If they pass after a restart or two, consider it a pass.

The subtopic thing... hpfx.collab != dd. This is a known thing; you need to change the subtopic:

dd subtopic x.y --> hpfx subtopic *.WXO-DD.x.y

dd is the first-gen datamart; all the other ones are second gen, which add the date and source to the tree. So on all current datamarts, the trees are two levels deeper. flux will be like this, as are all the internal ones. dd is kept as it is to prevent user revolt and to provide a migration path.
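To make the mapping concrete, a tiny sketch; the function name `hpfx_subtopic` is hypothetical and not part of sarracenia. It prepends a wildcard for the date level and the source name (WXO-DD, as in the example above) to a dd-style subtopic.

```python
def hpfx_subtopic(dd_subtopic: str, source: str = "WXO-DD") -> str:
    """Translate a first-gen (dd) subtopic into the two-levels-deeper form."""
    # '*' matches the date directory; the source name comes next.
    return f"*.{source}.{dd_subtopic}"


print(hpfx_subtopic("bulletins.alphanumeric.#"))
```

So the dd_amis config above would use `subtopic *.WXO-DD.bulletins.alphanumeric.#` when pointed at hpfx.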

petersilva commented 4 years ago

I think perhaps the retry is in question? i.e. it only fails when doing a retry? ... need to clarify...

petersilva commented 4 years ago

works in 2.20.04b3.