Closed jordanpadams closed 4 months ago
Modify Harvest to:
@jordanpadams @sjoshi-jpl
Is there a cognito tutorial for us somewhere? Just the readme of some repo maybe?
@alexdunnjpl @jordanpadams @tloubrieu-jpl
Looking for opinions. Currently, harvest splits up server information. In the harvest config, provide registry URL, index, and auth file. In auth file provide trust self signed and username/password. Trusting self signed is a server problem like the URL not auth information. Also, harvest uses a sax parser to read the config file making it difficult to understand what is legal and illegal in the file.
There are several theories of approach here:
So, least change would mean shoving clientID and urlIsCognito into auth file. Already had some discontent at last breakout over that. Not a huge fan of it myself because now you have repeated information for Nick and Nora who just care about their username/password and all the rest is meaningless. The auth file would look like:
trust.self-signed = true
clientID = a123bajkhfdkjalkjalj212lsdlkjsgjkl
urlIsCognito = false
# TODO Warning: Use the default username and password only for testing purposes in local setup
user = nickandnora
password = asta
Here, group the server information in harvest config and make auth just username/password:
user = nickandnora
password = asta
However this pushes three new items into the harvest config. Since it is a sax parser it will unduly complicate the code and provide no explanation of what is possible. In turn, all the examples would have to be expanded and expounded to cover the new options. It would looks something like:
<registry url="https://es:9200"
index="registry"
clientID=akjdsajfdkjaklfoiwi"
trust.self_signed="false"
urlIsCognito="true"
auth="auth.cfg" />
First, write a schema for harvest config. It would have docs to explain the options. It would allow for xmllint to determine if the config is valid and tell the user where the error is. Since the user base is PDS, they should be familiar enough with XML and schema to use it effectively.
Second, use JAXB and get out of the XML parsing business. Is much more secure with respect to injection and it also validates the XML as reading.
Lastly, auth file would be username/password only. The line would look more like:
<registry url="https://es:9200"
index="registry"
clientID=akjdsajfdkjaklfoiwi"
trust.self_signed="false"
style="direct" --> schema would limit to direct or cognito for now
auth="auth.cfg" />
Obviously I prefer the last. When I had to write my own harvest from scratch I had to read java code and was really annoyed that there was no schema. It will take a little bit of time (days) to do schema and stuff but will pay dividends over time.
Thoughts?
Firm default-vote for "right way".
@al-niessner +1 to "right way". We have gotten complaints about not having a schema file in the past for this. They have some schema generators online you can use to get a first pass at what that could be as well.
@al-niessner is looking at the xsd schema that requires update for the cognito configuration in the harvest configuration file.
The schema is now updated and is backward compatible. @tloubrieu-jpl can review the new schema available in the draft PR.
@al-niessner status: Got a response from Cognito team. Waiting for @sjoshi-jpl to make the happenings happen.
Task for @tloubrieu-jpl : write an updated documentation for Gary.
Task plan for @gxtchen :
@gxtchen as a note, after we met I also thought, we also need to think of how to automate the tests that you are going to write for harvest and registry-manager. That can be done in the docker compose set-up.
💡 Description