NOAA-OWP / wres

Code and scripts for the Water Resources Evaluation Service
Other
2 stars 1 forks source link

Swap to using environment variables for hard coded certs #296

Closed epag closed 1 week ago

epag commented 2 weeks ago

He currently have hard coded cert names in our broker

https://github.com/NOAA-OWP/wres/blob/bc993975a9ce41be6d0dc9eb54daa36d8ecb4e1c/wres-broker/nonsrc/rabbitmq.conf#L10-L23

Also there is this hard coded pass for our trust store. Which doesn't really do anything but for craftmanship should be extracted as well ( think it also shows up in a test file as well)

https://github.com/NOAA-OWP/wres/blob/bc993975a9ce41be6d0dc9eb54daa36d8ecb4e1c/wres-system/src/wres/system/SSLStuffThatTrustsOneCertificate.java#L111-L115

https://github.com/NOAA-OWP/wres/blob/bc993975a9ce41be6d0dc9eb54daa36d8ecb4e1c/wres-messages/src/main/java/wres/messages/BrokerHelper.java#L171-L172

HankHerr-NOAA commented 2 weeks ago

Thanks for the ticket!

Rather than just winging the changes on my own, I'll se how we use the other environment variables and piggy back off of that for a consistent approach. That is, unless those other environment variables don't provide a good basis for the change. I'll see.

I plan to work on this starting Tuesday. I forgot about my documentation task tomorrow which should take most of the morning. Documentation is every software engineer's favorite activity.

Hank

james-d-brown commented 2 weeks ago

Also, see the wres/wres-eventsbroker/nonsrc/bootstrap.xml for the eventsbroker monitor. The ports use system properties, but the keystore and truststore are hard-coded names. You can see how the system properties get mapped to environment variables in the docker-entrypoint.sh. For example:

ARTEMIS_CLUSTER_PROPS="-Dactivemq.remoting.amqp.port=${BROKER_AMQP_PORT} -Dactivemq.remoting.http.port=${BROKER_HTTP_PORT} -Dhawtio.disableProxy=true -Dhawtio.realm=activemq-cert -Dhawtio.role=wres-eventsbroker-admin -Dhawtio.offline=true -Dhawtio.sessionTimeout=86400 -Dhawtio.rolePrincipalClasses=org.apache.activemq.artemis.spi.core.security.jaas.RolePrincipal"

The environment variables are, in turn, defined in the Dockerfile, but you can abstract them out for the cert names, which should be configurable.

# Set env vars expected by BROKER
ENV BROKER_HOME=/usr/local
ENV BROKER_INSTANCE=/usr/local/wres-eventsbroker
ENV BROKER_CONFIG=/usr/local/wres-eventsbroker-config
ENV BROKER_WORK=/container_home
ENV BROKER_HTTP_PORT=15673
ENV BROKER_AMQP_PORT=5673
ENV ANONYMOUS_LOGIN=false
HankHerr-NOAA commented 2 weeks ago

Thanks, James. I'll start digging into it on Tuesday.

Hank

HankHerr-NOAA commented 1 week ago

Focusing on the cert configuration options Evan mentioned, and having no experience with RabbitMQ, I used a basic web search to see if environment variables can be used in a configuration file. According to the website,

https://www.rabbitmq.com/docs/configure#env-variable-interpolation

it can and the syntax is, $(env var). So, for example, ssl_options.cacertfile could be set as:

ssl_options.cacertfile = $(RABBITMQ_SSL_OPTIONS_CACERTFILE)

The environment variables would then be set in the docker entry .yml:

        environment:
         - RABBITMQ_CONFIG_FILE=rabbitmq.conf
        # rabbitmq.conf should have 360m specified as high watermark

And that can use the .env file. My proposed 6 environment variables for the broker will be:

ssl_options.cacertfile = $(RABBITMQ_SSL_OPTIONS_CACERTFILE)
ssl_options.certfile = $(RABBITMQ_SSL_OPTIONS_CERTFILE)
ssl_options.keyfile = $(RABBITMQ_SSL_OPTIONS_KEYFILE)
management.ssl.cacertfile = $(RABBITMQ_MANAGEMENT_SSL_CACERTFILE)
management.ssl.certfile = $(RABBITMQ_MANAGEMENT_SSL_CERTFILE) 
management.ssl.keyfile = $(RABBITMQ_MANAGEMENT_SSL_KEYFILE)

Each of those will need a new line the compose .yml and a new entry in the .env.

Does that sound about right? Again, I'm focusing on the cert options pointed out by Evan, first.

Hank

HankHerr-NOAA commented 1 week ago

I'm going to eat lunch, but, when I return, I'm going to start following the "Code Lifecycle" outlined by Evan in the Google Doc (which should probably be turned into a wiki for folks like me who haven't developed in GitHub before).

Again, please let me know if I barking up the right tree. Thanks,

Hank

HankHerr-NOAA commented 1 week ago

Oh, and I'm planning to use the -dev COWRES to test my changes.

Hank

epag commented 1 week ago

The above looks good to me

HankHerr-NOAA commented 1 week ago

I tested my ability to deploy using home-built .zip files, and it works. I deployed to -dev, and, after bringing back up the Postgres server in -dev (which was down since a machine reboot on Aug 28; I've notified ITSG), a smoke test passed. I'm going to use that as a starting point for making the broker configuration and Docker changes.

Thanks for your help, Evan!

Hank

HankHerr-NOAA commented 1 week ago

James:

FYI... When I try to run the unit tests on nwcal-ised-dev1, it becomes stuck at this point:

<=======------> 58% EXECUTING [51m]
> :wres-io:test > 0 tests completed
> :wres-io:test > Executing test wres.io.database.QueryTest

It never completes the QueryTest. Any thoughts?

Note that I am able to run the test on my laptop when executing the gradle build using Eclipse (there is some funky ASCII stuff happening when I copy this from the Eclipse Console):

wres.io.database.QueryTest

  Test simpleExecuteTest PASSED (1.6s)

If its a dumb user mistake, then I'm not sure it merits a ticket. Hence, I'll start by mentioning it here.

Hank

james-d-brown commented 1 week ago

Can't be a user mistake if this is a unit test. All db unit tests are executed against an h2 in-memory database. Other than looking at the debug test output, I've got nothin'. For example, could be that ports are conflicting or something...

james-d-brown commented 1 week ago

Although probably not because I don't think calls to h2 in memory (edit: embedded) are routed via a network interface. Could be the h2 driver, though. Also depends how you are running. You will always need to execute via gradle, never via the ide (edit: I mean, without the gradle wrapper).

HankHerr-NOAA commented 1 week ago

I'll take a closer look at that tomorrow. I am using Gradle to run the JUnit tests. I'll test it on other machines to see if there is one in particular that causes issues. I'll also look into getting debug output. This does sound like a separate ticket, so I'll create one tomorrow.

I'm testing changes to the broker now in -dev01. I added the environment variable references to the RabbitMQ config and compose .yml, and tried to bring it up without editing .env, first. The broker failed to come up, as expected since the environment variables were not set up.

I then added the .env values. That's when I discovered that I just happen to be using deprecated environment variables:

error: RABBITMQ_MANAGEMENT_SSL_CACERTFILE is set but deprecated
error: RABBITMQ_MANAGEMENT_SSL_CERTFILE is set but deprecated
error: RABBITMQ_MANAGEMENT_SSL_KEYFILE is set but deprecated
error: deprecated environment variables detected

Great. I picked the logical name, which happened to match the logical names previously used, and hence, errors. Awesome. Gotta pick another name,

Hank

HankHerr-NOAA commented 1 week ago

I modified the environment variables to include "WRES":

ssl_options.cacertfile = $(WRES_RABBITMQ_SSL_OPTIONS_CACERTFILE)
ssl_options.certfile = $(WRES_RABBITMQ_SSL_OPTIONS_CERTFILE)
ssl_options.keyfile = $(WRES_RABBITMQ_SSL_OPTIONS_KEYFILE)
...
management.ssl.cacertfile = $(WRES_RABBITMQ_MANAGEMENT_SSL_CACERTFILE)
management.ssl.certfile = $(WRES_RABBITMQ_MANAGEMENT_SSL_CERTFILE)
management.ssl.keyfile = $(WRES_RABBITMQ_MANAGEMENT_SSL_KEYFILE)

The names are getting a big long, but what do you all think? I can shorten "OPTIONS" to "OPTS" and "MANAGEMENT" to "MGMT", for example.

But the point is that this works as a step 1 (EDIT: That is, I was able to deploy to -dev and run a smoke test.). Yay!

Hank

HankHerr-NOAA commented 1 week ago

Pending a desire from someone to shorten the names (I'm willing to let them remain long), I'm sitting on the changes in a branch locally. Evan also pointed out two hardcoded instances of "changeit" in the code base that I need to look into. And James pointed out that key store and trust store variables should be changed in the events broker bootstrap.xml:

       <binding uri="https://0.0.0.0:${activemq.remoting.http.port}" 
                clientAuth="true" 
                keyStorePath="/wres_secrets/wres-eventsbroker_server_keystore.p12" 
                keyStorePassword="..." 
                trustStorePath="/wres_secrets/wres-eventsbroker_server_truststore.p12" 
                trustStorePassword="..." includedTLSProtocols="TLSv1.2,TLSv1.3">
           <app url="activemq-branding" war="activemq-branding.war"/>
           <app url="artemis-plugin" war="artemis-plugin.war"/>
           <app url="console" war="console.war"/>
       </binding>

I'm assuming both the path and passwords should be turned into environment variables, correct?

I'm hoping to make those changes tomorrow, but it may not be until later in the week. Thanks,

Hank

james-d-brown commented 1 week ago

Yes, but not directly inserted there. See above. The system properties go there, just like ${activemq.remoting.http.port}. The environment variables are mapped to system properties in the entrypoint script. See instructions earlier in this thread.

HankHerr-NOAA commented 1 week ago

Right. That was my undestanding; I was just cutting corners in my description. The environment variables will come from the .env which will be passed through the compose to the image which will pick them up in the entry script and use them to set Java system properties.

Unless someone has an issue with those long environment variable names, I'll probably use the same approach I used for the broker to name the environment variables. Something like, "WRESAMQP[system property in all caps and underscores instead of periods]". I'll document the proposed names here tomorrow before making the changes.

Thanks,

Hank

james-d-brown commented 1 week ago

I wouldn't make a habit of qualifying these variables with "WRES" because, afterall, they are all WRES. If they overlap with variable names that are used by dependencies, sure.

james-d-brown commented 1 week ago

More specifically, for the eventsbroker var names, I would continue the pattern already used, i.e., BROKER_*

HankHerr-NOAA commented 1 week ago

Thanks, James. I'll follow the pattern of "BROKER_" but am a bit worried about overuse of "broker". In the .env file, that could lead to confusion, but hopefully folks will see the "WRESRABBITMQ" prefix and know that that's the "broker" between the tasker and workers, and therefore conclude that the "BROKER" environment variables are for the "events broker".

I'm looking at the "changeit" code changes now. I'm assuming the password will be required. So I think this,

        try
        {
            customTrustStore.load( null,
                                   "changeit".toCharArray() );
        }
        catch ( IOException | NoSuchAlgorithmException | CertificateException e )
        {
            throw new IllegalStateException( "WRES could not create a custom trust store", e );
        }

should become this:

       try
        {
            String password = System.getProperty( TRUST_STORE_PASSWORD );
            if ( password != null )
            {
                customTrustStore.load( null,
                                       "changeit".toCharArray() );
            }
            else
            {
                throw new IllegalStateException( "The required environment variable " + TRUST_STORE_PASSWORD
                                                 + " was not found." );
            }
        }
        catch ( IOException | NoSuchAlgorithmException | CertificateException e )
        {
            throw new IllegalStateException( "WRES could not create a custom trust store", e );
        }

Or will we make it optional and try to define the custom trust store without a password if the environment variable is not found?

Hank

james-d-brown commented 1 week ago

If you want to adapt to EVENTSBROKER, I am fine with that, the main thing is consistency across names in a given context, as well as brevity, which you mentioned before.

Note there is a typo in your code snippet above (just in case you copy/paste).

HankHerr-NOAA commented 1 week ago

Thanks for pointing out the typo. And, yes, consistency will be key. I'll stick with BROKER since we are using it elsewhere.

I'm not sure I agree with the need to turn "changeit" in SSLStuffThatTrustsOneCertificate into an environment variable. That password is being defined for an internal truststore that contains a provided cert:

     * @param derEncodedCertificate the certificate, intermediate cert, or CA

The password is not tied to the password for a truststore in a file somewhere. If I'm misunderstanding that code, please let me know. If I'm understanding it correctly, but one of you feels that an internal trust store password needs to come from an environment variable, I can make it happen, I'm just not sure its useful.

I'm now looking at BrokerHelper.java. Thanks,

Hank

HankHerr-NOAA commented 1 week ago

The BrokerHelper.java makes more sense to turn the password into an environment variable. The code therein creates a trust store corresponding to that specified by -Dwres.trustStore, which is defined the compose .yml as -Dwres.trustStore=${WRES_TRUST_STORE}. The environment variable WRES_TRUST_STORE points to the file trustedCertificateAuthorities.jks. That file is created externally to the WRES following instructions in the VLab wiki. The environment variable to create will be WRES_TRUST_STORE_PASSWORD and will be passed in via the system property -Dwres.trustStorePassword.

I'll make that change now.

Hank

james-d-brown commented 1 week ago

If the method is passed an input stream, then it should probably be passed the, er, password too, and that should come from the relevant env variable in each context where the method is called and an input stream supplied from the classpath or file system. As far as I can see, SSLStuffThatTrustsOneCertificate follows that pattern. It takes a stream that could originate from anywhere, caller decides. I didn't follow through to whatever public api methods are calling it and the various contexts used, but I think it makes sense to have a password as a parameter anytime the cert is supplied as a parameter.

HankHerr-NOAA commented 1 week ago

The class is taking a, "certificate, intermediate cert, or CA", which don't use passwords, not a trust store that uses a password. The job is to then wrap the .pem in a trust store for internal use in the context of, for example, reading from WRDS. There is no password to associate with the provided cert. Hence, a password argument, if it existed, would logically be the password to assign to the returned trust store. Okay. But, as is already shown by how SSLStuffThatTrustsOneCertificate is currently used, that password is not needed outside of this method. So why require it be passed it by the caller? In fact, why even have a password (if its not required; I would have to check)? After all, its just an internal trust store.

I base the above on how the provided file or stream to SSLStuffThatTrustsOneCertificate is being used. Specifically, after the custom trust store is created, its being used as follows:

            CertificateFactory certificateFactory = CertificateFactory.getInstance( "X.509" );
            Certificate certificate = certificateFactory.generateCertificate( derEncodedCertificate );
            customTrustStore.setCertificateEntry( "SoleTrusted", certificate );

Hence, the derEncodedCertificate must be a single, x.509 cert. Again, unless I'm misreading it.

Thanks,

Hank

HankHerr-NOAA commented 1 week ago

The BrokerHelper.java is a much more clear cut case for specifying the password as a property. The name of the trust store is specified by -Dwres.trustStore (traced back to env var WRES_TRUST_STORE), so it makes sense to specify the password by, -Dwres.trustStorePassword (to be provided by env var WRES_TRUST_STORE_PASSWORD).

One concern I have is that the property is being used without checking if its specified. Specifically, here is where the wres.trustStore is read in and used:

        String trustStore = System.getProperty( TRUST_STORE_PROPERTY_NAME );

        Path alternativeTrustStorePath = Paths.get( trustStore );
        File alternativeTrustStoreFile = alternativeTrustStorePath.toFile();

I'm assuming if trustStore is null, that Path call will error out, perhaps due to an NPE. I'm going to add loading the password to that code and then check that both are non-null. If either is null, I'll throw an IllegalStateException as is thrown further down in the code if the trustManagerFactory fails to initialize.

Hank

james-d-brown commented 1 week ago

It makes no sense to have a method that takes a cert (stream) as a parameter and to then attempt to acquire or define the password for that cert within the method. I am merely pointing out that fact. As to where the cert comes from, you will need to track that down (all contexts) and refactor accordingly so that there isn't a stream passed without a password (e.g., read the stream where it can be known that no password is required).

HankHerr-NOAA commented 1 week ago

I will track down the contexts. I'm also going to test calling load with a null password, since the Javadoc indicates that's allowed. Again, from what I can tell, the thing being in has no password associated with it, and the resulting truststore never leaves the code, so I see no reason to password protect it if its not required.

Hank

james-d-brown commented 1 week ago

If it isn't coming from an external source (including the classpath), I agree.

HankHerr-NOAA commented 1 week ago

SSLStuffThatTrustsOneCertificate is used in two contexts: (1) obtaining data and thresholds from WRDS, and (2) talking to the Postgres database. The Postgres cert can come from the classpath, which is often the case when run as a standalone, but typically traces back to the .env in cluster mode. The WRDS cert is loaded from a file whose name is traced back to the .env file.

So, yes, the source of the certificate is external, either file or classpath, but the source does not have a password associated with it. SSLStuffThatTrustsOneCertificate is creating a trust store to use internally (only) and assigning that trust store an arbitrary, otherwise unused password.

I still lean toward not using a password for that internal truststore (I would have to test if null works, of course). However, if there are strong feelings that the password should be passed along, I can make that change. It just feels like it would be coding without any benefit, and would have the negative of requiring yet more system properties/environment variables to specify at run time.

Hank

james-d-brown commented 1 week ago

Again, a method that takes a cert cannot make that decision because a cert and a cert password is a pairing. One without the other makes no sense. Accepting null if there is none or none is needed makes sense. Also, encapsulating the process of acquiring a cert from a source whose password is empty and then not supplying a password also makes sense. What exists now doesn't really make sense, even as a private method - it makes it harder for other devs to unravel.

HankHerr-NOAA commented 1 week ago

I'm testing my changes to BrokerHelper.java in the -dev COWRES now, in case anyone is impacted.

Hank

HankHerr-NOAA commented 1 week ago

After custom building my compose for the entry machine, I deployed without updating the .env and received the error I expected in the tasker and worker; for example, in the tasker:

tasker_1        | Exception in thread "main" java.lang.ExceptionInInitializerError
tasker_1        |       at wres.tasker.Tasker.main(Tasker.java:134)
tasker_1        | Caused by: java.lang.IllegalStateException: Both wres.trustStore and wres.trustStorePassword must be specified to initialize the trust manager.
tasker_1        |       at wres.messages.BrokerHelper.getDefaultTrustManager(BrokerHelper.java:165)
tasker_1        |       at wres.messages.BrokerHelper.getSSLContextWithClientCertificate(BrokerHelper.java:355)
tasker_1        |       at wres.tasker.WresJob.initializeBrokerConnectionFactory(WresJob.java:1006)
tasker_1        |       at wres.tasker.WresJob.<clinit>(WresJob.java:144)
tasker_1        |       ... 1 more

I then added the password to the .env, WRES_TRUST_STORE_PASSWORD, and it worked. A smoke test passed.

The events broker change is also clear cut, so I'm working on that one next. Per the conversation with James, the change to SSLStuffThatTrustsOneCertificate may be a bit broader than the other changes, so I'll do that one last.

Oh, and Evan mentioned a unit test that may be impacted by the "changeit" changes, so I'll look for that.

Hank

HankHerr-NOAA commented 1 week ago

Here are my proposed events broker system properties as referenced in boostrap.xml:

                keyStorePath="${eventsbroker.keystore.path}" 
                keyStorePassword="${eventsbroker.keystore.password}" 
                trustStorePath="${eventsbroker.truststore.path}" 
                trustStorePassword="${eventsbroker.truststore.passphrase}"

The environment variables will have the same names, but in all caps and with underscores instead of periods. Thoughts?

Hank

james-d-brown commented 1 week ago

Sounds good.

HankHerr-NOAA commented 1 week ago

Cool. A couple changes... Passphrase will be changed to password, to make it consistent with the argument name. I'll use broker instead of eventsbroker to make it internally consistent with other parameters/options. The only point of confusion could come in the .env, as mentioned earlier, and I'll try to comment to make that clear.

With that in mind, here is the change to the docker entrypoint script:

ARTEMIS_CLUSTER_PROPS="-Dactivemq.remoting.amqp.port=${BROKER_AMQP_PORT} -Dactivemq.remoting.http.port=${BROKER_HTTP_PORT} -Dbroker.keystore.path=${BROKER_KEYSTORE_PATH} -Dbroker.keystore.password=${BROKER_KEYSTORE_PASSWORD} -Dbroker.truststore.path=${BROKER_TRUSTSTORE_PATH} -Dbroker.truststore.passphrase=${BROKER_TRUSTSTORE_PASSWORD} -Dhawtio.disableProxy=true -Dhawtio.realm=activemq-cert -Dhawtio.role=wres-eventsbroker-admin -Dhawtio.offline=true -Dhawtio.sessionTimeout=86400 -Dhawtio.rolePrincipalClasses=org.apache.activemq.artemis.spi.core.security.jaas.RolePrincipal"

Of note, the environment variable BROKER_AMQP_PORT is set in the Dockerfile. I plan to set the new 4 environment variables in .env and then include them in the container environment via the compose file environment for the events broker container. If that doesn't sound right, please let me know.

Thanks,

Hank

HankHerr-NOAA commented 1 week ago

I made the bootstrap, docker entry, and compose template changes and deployed to -dev without updating .env. The events broker failed to come up, as expected; an example message:

eventsbroker_1  | 2024-09-04 15:14:09,049 WARN  [org.apache.activemq.artemis.core.client] AMQ212037: AMQP connection failure to 172.19.254.72:51640 has been detected: null [code=REMOTE_DISCONNECT]

I then added the 4 new environment variables with values matching those previous hardcoded. Unfortunately, the password check for the keystore is failing. Need to figure out why,

Hank

HankHerr-NOAA commented 1 week ago

To anyone who has the answer...

I modified the events broker docker entrypoint script to include this echo:

echo "HANK ####>> $ARTEMIS_CLUSTER_PROPS"

Where should I see that echo? The entry point script is supposed to be run when the container starts, so it should appear at/near the top of the Docker logs for the events broker container, right?

What funny is, after I made that change, I built the events broker image and pushed it to the registry, but did not see any layers getting updated when I ran the docker push. Hmmm....

Hank

HankHerr-NOAA commented 1 week ago

Figured it out. There was a typo in the docker commands I had to manually run to tag and push the image.

Hank

HankHerr-NOAA commented 1 week ago

No idea why I can't see my echo statement in any logs, and no idea why the password is not working. Eating lunch.

Hank

HankHerr-NOAA commented 1 week ago

It was another typo. Sheesh. I managed to get my echo into the script, brought up the containers, and confirmed that the system properties look good:

-Dbroker.keystore.path=/wres_secrets/wres-eventsbroker_server_keystore.p12 -Dbroker.keystore.password=wres-eve
ntsbroker-passphrase -Dbroker.truststore.path=/wres_secrets/wres-eventsbroker_server_truststore.p12 -Dbroker.truststore.passphrase=wres-eventsbroker-passphrase

From the original bootstrap.xml, I see this:

                keyStorePath="/wres_secrets/wres-eventsbroker_server_keystore.p12" 
                keyStorePassword="wres-eventsbroker-passphrase" 
                trustStorePath="/wres_secrets/wres-eventsbroker_server_truststore.p12" 
                trustStorePassword="wres-eventsbroker-passphrase" includedTLSProtocols="TLSv1.2,TLSv1.3">

They match. Yet the password is being rejected. Thinking that perhaps the hyphens in the the values of the password were causing issues, I quoted the passwords, and that didn't help.

At this point, I'm stumped as to why the passwords are not working when coming in via system properties. I'm eating lunch now, but will continue to investigate as time allows.

Hank

james-d-brown commented 1 week ago

There is some inconsistency between your use of password and passphrase in these system properties. I would fix that. Are you sure you named the system properties correctly in the bootstrap.xml?

HankHerr-NOAA commented 1 week ago

Thanks for forcing me to take a closer look at the bootstrap.xml. I had looked at it many times, but my brain was just skipping over a typo in one of the environment variables repeatedly. Now, with lunch and your message, I spotted the mistake. When I fixed it, I was able to bring up the service and post a smoke test that included graphics output.

Here is a snippet from bootstrap.xml that works:

   <web path="web">
       <binding uri="https://0.0.0.0:${activemq.remoting.http.port}" 
                clientAuth="true" 
                keyStorePath="${broker.keystore.path}" 
                keyStorePassword="${broker.keystore.password}" 
                trustStorePath="${broker.truststore.path}" 
                trustStorePassword="${broker.truststore.password}" 
                includedTLSProtocols="TLSv1.2,TLSv1.3">
           <app url="activemq-branding" war="activemq-branding.war"/>
           <app url="artemis-plugin" war="artemis-plugin.war"/>
           <app url="console" war="console.war"/>
       </binding>
   </web>

It consistently uses "password". The pertinent line in the entry point script that works is this:

ARTEMIS_CLUSTER_PROPS="-Dactivemq.remoting.amqp.port=${BROKER_AMQP_PORT} -Dactivemq.remoting.http.port=${BROKER_HTTP_PORT} -Dbroker.keystore.path=${BROKER_KEYSTORE_PATH} -Dbroker.keystore.password=${BROKER_KEYSTORE_PASSWORD} -Dbroker.truststore.path=${BROKER_TRUSTSTORE_PATH} -Dbroker.truststore.password=${BROKER_TRUSTSTORE_PASSWORD} -Dhawtio.disableProxy=true -Dhawtio.realm=activemq-cert -Dhawtio.role=wres-eventsbroker-admin -Dhawtio.offline=true -Dhawtio.sessionTimeout=86400 -Dhawtio.rolePrincipalClasses=org.apache.activemq.artemis.spi.core.security.jaas.RolePrincipal"

I then added these environment variables in the compose YAML:

        environment:
         - BROKER_WORK=/container_home
         - BROKER_KEYSTORE_PATH=${EVENTSBROKER_KEYSTORE_PATH}
         - BROKER_KEYSTORE_PASSWORD=${EVENTSBROKER_KEYSTORE_PASSWORD}
         - BROKER_TRUSTSTORE_PATH=${EVENTSBROKER_TRUSTSTORE_PATH}
         - BROKER_TRUSTSTORE_PASSWORD=${EVENTSBROKER_TRUSTSTORE_PASSWORD}

Note the use of "EVENTSBROKER" in the .env. I think that allows for sufficiently marking those variables as being for the events broker within that file (which spans all components), while, within the events broker, just "BROKER" (or "broker") is consistently used.

Reasonable?

Hank

HankHerr-NOAA commented 1 week ago

I just did a local commit and was able to deploy successfully specifying the appropriate revisions at the command line. Yay!

Now its time to look at SSLStuffThatTrustsOneCertificate. I think the goal is to (1) see if a null password is even an option. If so, (2) change the code so that the password is an optional argument to the constructors (it uses constructors at this point), and, if not specified, a null password is used for the trust store. That will allow the call to define the password, if its needed.

But, first, let me see if I can use a null password and have it succeed.

Hank

epag commented 1 week ago

Currently the trust store sets a password so its unlikely that it will work. Would probably need to create a trust store with no pass word for that scenario to have a possibility of working

james-d-brown commented 1 week ago

Reasonable?

Hank

Yes, sounds good.

HankHerr-NOAA commented 1 week ago

Evan:

The code SSLStuffThatTrustsOneCertificate is only being used currently to take a .pem file and wrap it in a trust store for internal use. It is NOT being used to access any trust store that exists as a flat file somewhere.

For example, SSLStuffThatTrustsOneCertificate is used to prepare a trust store for accessing WRDS as well as access the Postgres database. I just ran an evaluation that uses the WRDS RFC forecast service and the threshold service, at it succeeded. That was using null for the password to the internal trust store:

        try
        {
            customTrustStore.load( null,
                                   null );
        }
        catch ( IOException | NoSuchAlgorithmException | CertificateException e )
        {
            throw new IllegalStateException( "WRES could not create a custom trust store", e );
        }

That's in getTrustManagerWithOneAuthority. I'm going to make absolutely sure I've identified all contexts in which the code is called, so that I can be sure to test them.

Once I'm confident null works, then I can modify the code to allow for the caller to specify a custom password or leave it without a password.

hank

HankHerr-NOAA commented 1 week ago

SSLStuffThatTrustsOneCertificate is only initialized through one of its constructors with get methods called after. The other non-get methods are all private. Using Eclipse, those two constructors are called in wres.reading.ReaderUtilities.fillThresholds(...), wres.reading.getSslContextTrustingDodSignerForWrds(), and wres.system.PgSSLSSocketFactory (constructor).

Note that I had assumed that the class is used for Postgres access, but I think I'm wrong. The calls in wres.system.PgSSLSSocketFactory are all within the constructor, and that constructor appears to never be used... at least according to Eclipse.

I suppose there is a chance that Eclipse is wrong here, so if someone spots something fishy, let me know. I need to hop into another meeting.

Hank

HankHerr-NOAA commented 1 week ago

So, I added the password to the constructor in SSLStuffThatTrustsOneCertificate, with the Javadoc indicating it can be null if there is to be no password for the internal trust store.

Just to reiterate, that password is NOT being used to read in the certificate from a file. So, if that certificate is somehow password protected, this code change would not help with that. Rather, it assumes the InputStream that it is provided can be read as is to obtain the certificate, which will then be placed in an internal trust store and will use the password provided in the constructor.

I've tested the code using a null password for the internal trust store of the WRDS CA .pem. It seems to work. I'll now assign a password in the WRDS-related code that calls it.

If any of this doesn't make sense, please let me know.

Hank

HankHerr-NOAA commented 1 week ago

Using a fixed password works.

Looking at the .env, the current WRDS CA .pem is specified as this:

WRDS_CA_FILE=/wres_secrets/DODSWCA_60.pem

The environment variable for the internal trust store password would be something like, WRDS_CA_INTERNAL_TRUST_STORE_PASSWORD, though that may be a bit wordy. The system property would be, `-Dwrds.ca.internal.truststore.password' or something like that. I need to step away, but I'll think about ways to shorten it.

That is, assuming we still want to do that. Again, the internal trust store password would serve no purpose to anyone or anything outside of the WRES code base. So I'm tempted to actually let the system property exist, in case there is a purpose for it in the future, but then not use it in practice (i.e., leave it undefined).

Hank

HankHerr-NOAA commented 1 week ago

I've added an optional system property next to where the other WRDS property is defined:

        String pathToTrustFile = System.getProperty( "wres.wrdsCertificateFileToTrust" );
        String passwordForInternalTrustStore = System.getProperty( "wres.wrdsInternalTrustStorePassword" );

That password is then used to construct the internal trust store for access WRDS from either a CA on a flat file or on the class path:

            try ( InputStream trustStream = Files.newInputStream( path ) )
            {   
                SSLStuffThatTrustsOneCertificate sslGoo =
                        new SSLStuffThatTrustsOneCertificate( trustStream, passwordForInternalTrustStore );
                return Pair.of( sslGoo.getSSLContext(), sslGoo.getTrustManager() );
            }
...
            SSLStuffThatTrustsOneCertificate sslGoo =
                    new SSLStuffThatTrustsOneCertificate( inputStream, passwordForInternalTrustStore );
            return Pair.of( sslGoo.getSSLContext(), sslGoo.getTrustManager() );
        }

The first call, above, is when loading the cert from a file, the second from the class path. If the property is not specified, then the string will be null and no password will be used.

For now, since I see no reason to lock down the internal trust store, I'm not going to use this property, though I have tested that it doesn't break anything in -dev. Whether or not the wres.wrdsInternalTrustStorePassword is specified in the Java options, it works.

I'll do some more testing tomorrow, summarize what was done, and then do the pull request either tomorrow or Friday.

Hank