Closed eapearson closed 2 years ago
This pull request introduces 1 alert when merging 3c80e9760f4ed061d8bbf53d38eda4569de4b4a9 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
I realize the checks (tests) are failing, working on a GHA based test runner (have working Docker image config, just need to wire up the workflow.)
This pull request introduces 1 alert when merging 5f04f82f173ce1fb50031ba3eb4aa2a623904e14 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
@scanon, sorry, I thought you might be pinged on this for review, so consider this the ping!
I realize that the new eap_scripts do not permanently save the results (as the other config transformation scripts do), and the directory should probably not be named eap_scripts. The original vision is that the sample_service could do that transformation on the fly. So some form of that code could live in the sample_service. But I think it might be better if, the sample_service, or whoever uses the config repo, after cloning the config repo, could simply run those scripts to generate the files. The generated files are the same ones used by the sample and sampleset landing pages, as well as the experimental/testing importer I've been using to exercise them against many SESAR samples.
This pull request introduces 1 alert when merging 2531bde9481c61d37ab35be7383ee1c355d98813 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
@Shane I tried to address all of your comments.
The PR is on the big side, and the review comment UI is a bit finicky, so I'm not 100% sure I addressed all the questions or comments.
This pull request introduces 1 alert when merging 34eb3daf4e17286a5aa2420af19be91272ee52e9 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
This pull request introduces 1 alert when merging 8d2e57dde7cecdcefac9a09438cca02000deae08 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
This pull request introduces 1 alert when merging bc2be3cd7df025ad326238479ad7f6345272474b into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
This pull request introduces 2 alerts and fixes 3 when merging 0bd18bbf41e45d10e43bdf0eec4fb8a06fb5371e into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request introduces 2 alerts and fixes 3 when merging 089dea0ef94b721c9691030a2ec2b10020906138 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request introduces 2 alerts and fixes 3 when merging 1a04aaf95d18b8306095de852944024a59d80df5 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request introduces 2 alerts and fixes 3 when merging 0e3eda0a0a1ec6df32def24aaf1dda8a1da0d8d4 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request introduces 1 alert and fixes 3 when merging 25834ba6418465cf123d6ea1abe6d461a1684e62 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request introduces 1 alert and fixes 3 when merging e699f671500b78c32df9f46e2896196fceb70da3 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request introduces 1 alert and fixes 3 when merging 8fc4c976236d4b0e3c638aa3ba588abab5629afd into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request introduces 1 alert and fixes 3 when merging d58a4ef999262d55863f89499238e051bdf08c40 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request introduces 1 alert and fixes 3 when merging 48d9800789f62abf3a92d8c4da2d05f42414ec26 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request introduces 1 alert and fixes 3 when merging a1280b1956a528f30f24eb6067a335be183af6be into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request introduces 1 alert and fixes 3 when merging e2d346aca5bdabd934f6061622abc54016cc9bf8 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request introduces 1 alert and fixes 3 when merging 0aac5ab71830c5cb3d4103e02ac0d73afe15d20c into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request introduces 1 alert and fixes 3 when merging 7b552ddbbfd3c6ba467537459421d1f814d1110e into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request introduces 1 alert and fixes 3 when merging 825cb85cbec0afe5673088cbdf854ce2ffd2ab69 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request introduces 1 alert and fixes 3 when merging eb8654f9f39f9b00aa002f58faed40ef3fec4369 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request introduces 1 alert and fixes 3 when merging 9234f5da7673f61c89e901afb23b56e5816bd2a2 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request introduces 1 alert and fixes 3 when merging b4754ef7d3edb897ce826f34a9b3a74dc12dc34d into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request introduces 1 alert and fixes 3 when merging fa3e5cc7fb39310c4df92d68e28c2072f21f689e into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request fixes 3 alerts when merging 2f448c74a2bda4105fabb43af01acc63b2980ea5 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
fixed alerts:
This pull request fixes 9 alerts when merging c5bb1d664a36348eaf63eab4a1330d00be62222b into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
fixed alerts:
This pull request fixes 9 alerts when merging 473d9bde58645082ab4a0bced9a3c6749f19633b into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
fixed alerts:
This pull request fixes 9 alerts when merging 06022e3c3da06cf4b8f42ab34587d47ed501c36a into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
fixed alerts:
This pull request fixes 9 alerts when merging 3fd7427124deac118c94ea1a190b9c07779cc450 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
fixed alerts:
This pull request fixes 9 alerts when merging f6c9ffcd19bea5b52e5c4160a896af1286fd708d into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
fixed alerts:
This pull request fixes 9 alerts when merging 7605cc6e2bd9e2cd0c3636305431e6674b06189f into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
fixed alerts:
This pull request fixes 9 alerts when merging 191bb4060a9c9ad897703f6750446e8054430b8e into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
fixed alerts:
This pull request fixes 9 alerts when merging 5136831fde6e4ac1189ce43954d03242bd7cb0cf into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
fixed alerts:
The last set of changes adds GHA workflow support for building the distributable assets - the metadata validation / definition files, the groupings, the templates, and an html page presenting the sample fields in the grouping order.
While doing this work, I also reorganized files and removed unused data, script and other files.
The files are published to a new set of branches, dist-master
, dist-release
, dist-pull_request-#
(where # is the PR #). In addition the release is tagged with the version dist-release-v#.#.#
.
Other than the new location for these files, the formats are the same. The ontology metadata file was removed since it is duplicative of the main metadata validation file and was not covered by a jsonschema (which is used for validation of file format.)
Documentation explaining this further has been added in the docs directory.
This pull request fixes 9 alerts when merging 52522bf6381eca72b7b14fab00839c6d8d364eed into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
fixed alerts:
@scanon I think it is time to finish this up. Since you last saw it, the GHA build has been added, which produces distribution branches. A new document describes this process. You can see my fork to see how it can be used, since the workflow won't be activated until it is merged. I could re-cast this as a direct branch in the repo, which would at least build the distribution for the PR. I could also add feature-branch activation of the workflow, which would stimulate the same pathway as a push to master.
This pull request fixes 9 alerts when merging 09e7df3521d1dd387efcc54692fde106f0408b4a into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
fixed alerts:
This pull request fixes 9 alerts when merging 3e551192f3a943987454cb7966b051d33ee3ee82 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
fixed alerts:
This pull request introduces 1 alert and fixes 9 when merging f86bcb9b74629e94f9720c022d2747c6e5d0639e into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request introduces 1 alert and fixes 9 when merging a93dcc35801f990f21fb92e0b74b9996da522320 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request introduces 1 alert and fixes 9 when merging 5ff6827884f498186e314b0bde2e8e6f496d19af into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request introduces 1 alert and fixes 9 when merging c394117b01844511534a80860238593f5886e968 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request introduces 1 alert and fixes 9 when merging e5e76b2fbcc1e1e631429e61a77a50a219ada4e1 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request introduces 1 alert and fixes 9 when merging 9beaf92bffd06ed5bf2f7125b262e092c8954362 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request introduces 1 alert and fixes 9 when merging f47e69e53d76cf38d4153e8c28a27ef59e104dd2 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
This pull request introduces 1 alert and fixes 9 when merging 9633cb6327ed2e7a2f20b5b1e412003053e2c8a9 into 701825e0627aad865d00256b05bc89f39cf7052d - view on LGTM.com
new alerts:
fixed alerts:
Summary of Changes
TLDR
This set of changes is oriented around adding support for the exporting of sample services specs into a less idiosyncratic, more ui-friendly sample configuration files. The files will be made available by new sample service methods which essentially consume and then emit those files.
The existing generated files and their location are retained for usage by the Sample Service and Sample Uploader.
This PR implements this through a new set of python and shell scripts, and a configuration distribution process based on GitHub Action workflows which create distrubution branches under various conditions.
The new config files support field ordering and grouping (
groups.json
), and field definitions (schema.json
), into json files more suitable for direct ui consumption. The grouping is a simple json object. The field definitions are jsonschema with KBase enhancements.The jsonshchema may be used validate samples, although no production code does or is planned to do so. However, the process of developing the jsonschema, as well as jsonschema to validate the jsonschema (!) revealed issues in the sample configuration that needed correction:
A few miscellaneous code quality and other fixes.
I've tested these changes, or at least the resulting output files, with a sample validation and import tool utilizing 10s of thousands of SESAR samples. At present these samples fully validate, with the exception of a handful of samples with true validation errors. They do not presently import into our systems due to the fact that some "unit" fields have been restored and are currently (outside of this PR) not configured to be present in samples. Thus, when placed in a sample's meta_controlled the sample service throws an error:
Danger
Some of these changes may be breaking:
templates
For both templates, moved the top level
Template
,ID
, andName
fields to the top of the file. It is simply easier to grasp the template with these fields at the top of the file, and the columns extending below them.For both templates, added the top level
Parent
field. We are not currently using this feature, but we do have existing mappings for this, so I've restored them. If we fail to capture this info those samples will never be able to form relationships.For both templates, field labels are considered to be compared case insensitively; any alias which duplicates the primary field label, or another alias, is removed; any empty aliases attribute is removed.
ENIGMA
No other changes
SESAR
add
apiKey
mandatory attribute; under certain conditions (bulk download) the downloaded samples are formatted as csv with column labels in the form of the api keys, which are snake-cased. The other form of download is Excel xlsx, which uses the human-friendly form of column labels.add
disposition
attribute, which contains a disposition symbol (string):required
,recommended
,ignore
.required
andrecommended
reflect the SESAR documentation.required
should be used to ensure that SESAR-required fields are present; I've found, though, that SESAR does not enforce this across all sample types. Imagine a parent sample for a site, with multiple child samples for actual specimens.recommended
could be used as a quality measurementignore
means that the field should be ignored and not imported; see the "URL" comment in "Thoughts" below.restore the fields
age_unit
andgeological_unit
, which are used to provide the unit for their associated value fields; if we don't store all fields uploaded, downloading sample spreadsheets which can be used for re-import is more difficult.restore the transformation for
Comment
, not sure why it was missing.added missing fields:
url
(and should also be ignored.)Ignoring Fields
The new template attribute "disposition" allows ignoring fields. Here are some examples of fields which should be ignored:
Cur Registrant Name
set as disposition "ignore". It is generated from the user's id stored with the sample at upload time. There are a few problems with this field:The "Related URL" fields are ignored. This includes:
These fields are supported for initial upload, but are not downloaded. They are available through the api as either "external_url" or "publicationUrl", each of which support multiple urls.
They are not available for download, probably because the internal structure is actually a set of related urls and css doesn't explicitly support a sub-table in a field. The spreadsheet upload allows a single related url.
It is notable that SESAR also supports uploading by XML, which does support more complex data than a flat table. E.g. it allows uploading of multiple related urls.
is_archived
: Appears to be another synthesized field; it is not present in the upload template.Export Scripts
Scripts were refactored. There are now three sets of scripts:
The automation scripts will place all built assets into the
dist
directory, which is excluded from git. For a local build, the dist directory will simply be populated. For a GHA build, the directory is populated, and then copied to a distribution branch.The new scripts which generate and validate the ui configs are located in
scripts/export/ui
, where you can find aREADME.md
describing each script.Existing Scripts
Existing scripts were only lightly touched, and simply relocated.
/scripts/validate/validate_templates.py
required adding'Parent'
as a required field; touching the file required some code style changes (automated)Vocabulary changes
A few changes were made to vocabularies. This includes:
Field Grouping / Ordering Changes
Changes consist of adding missing fields, and in one case, a missing group. The
scripts/export/ui/export.py
script checks for the consistency of groups and controlled field schemas, ensuring that each controlled field exists in the grouping config, and that the grouping config only contains controlled fields.Spec Distribution
A GitHub Action workflow creates distribution branches containing just the built sample service specs.
Please see the dedicated document docs/build-and-deploy.md for details.