Ingestion package loader use logging ERROR WARNING INFO

cuddihyge commented 1 year ago

[x] prepend progress log lines with ERROR WARNING INFO tags (pushed to RACK)
[x] provide Java methods to parse log level and message from a given log line (pushed to RACK)
[x] made a separate issue to consider any client changes

cuddihyge commented 1 year ago

note: remember to check ERROR. If there are multiple lines, the others might better be labelled DEBUG: or INFO: whatever

weisenje commented 1 year ago

@glguy, @chrisage: sample output below. Please give a thumbs-up if this format is fine.

INFO: Loading manifest for 'Entity Resolution'...
INFO: Clear graph http://junit/G7JZH4J3E/200005868/auto/rack001/model
INFO: Loading manifest for 'RACK ontology'...
INFO: Load OWL OwlModels\AGENTS.owl to http://junit/G7JZH4J3E/200005868/auto/rack001/model
INFO: Store nodegroups
INFO: Stored: JUNIT query Files of a Given Format
INFO: Stored: JUNIT query Requirements without Tests
INFO: Load CSV Package-1\PROV_S_ACTIVITY1.csv as class http://arcos.rack/PROV-S#ACTIVITY
WARNING: Input is missing these columns: endedattime, goal_identifier
INFO: Clear graph uri://DefaultGraph
INFO: Copy graph http://junit/G7JZH4J3E/200005868/auto/rack001/model to uri://DefaultGraph
INFO: Successfully copied http://junit/G7JZH4J3E/200005868/auto/rack001/model into urn:x-arq:DefaultGraph
INFO: Perform entity resolution in uri://DefaultGraph
INFO: Load complete

Or for an error condition:

ERROR: Cannot find a top-level manifest in resource14457517151095630833.zip

weisenje commented 1 year ago

@glguy, @chrisage: in addition to making the output look like above, @cuddihyge suggested that we could also modify the SemTK clients (Java and/or Python) to perform the logging (at a given level) and return a boolean representing whether the load succeeded. Shall I give that a try, or is there a reason not to?

glguy commented 1 year ago

I think this makes sense. For some reason we're instantiating our own logger instance at the top of the cli. I'm going to check with @Ptival to remember why. I'm just curious to see what mind happen if the semtk python3 bindings were to use the system logger and not this one.

Ptival commented 1 year ago

I think it might just have been as a way of attaching a custom formatter to our logs in this commit:

https://github.com/ge-high-assurance/RACK/commit/6cc8604733103c747d92c905c0a133799f43729c

I think there'd be nothing wrong with just using the system logger? The custom formatting was nice during development to easily distinguish errors from SemTK, errors from the CLI, etc. But if we want the output to be more uniform, I think the custom formatter can go away.

chrisage commented 1 year ago

@glguy, @chrisage: sample output below. Please give a thumbs-up if this format is fine.

INFO: Loading manifest for 'Entity Resolution'...
INFO: Clear graph http://junit/G7JZH4J3E/200005868/auto/rack001/model
INFO: Loading manifest for 'RACK ontology'...
INFO: Load OWL OwlModels\AGENTS.owl to http://junit/G7JZH4J3E/200005868/auto/rack001/model
INFO: Store nodegroups
INFO: Stored: JUNIT query Files of a Given Format
INFO: Stored: JUNIT query Requirements without Tests
INFO: Load CSV Package-1\PROV_S_ACTIVITY1.csv as class http://arcos.rack/PROV-S#ACTIVITY
WARNING: Input is missing these columns: endedattime, goal_identifier
INFO: Clear graph uri://DefaultGraph
INFO: Copy graph http://junit/G7JZH4J3E/200005868/auto/rack001/model to uri://DefaultGraph
INFO: Successfully copied http://junit/G7JZH4J3E/200005868/auto/rack001/model into urn:x-arq:DefaultGraph
INFO: Perform entity resolution in uri://DefaultGraph
INFO: Load complete

Or for an error condition:

ERROR: Cannot find a top-level manifest in resource14457517151095630833.zip

LGMT! Solves the problem of merging multiple output streams (ERR/INFO/WARNING/CRITICAL/DEBUG?) into one by prefixing the classification. As discussed, these statements are streamed real-time and can be amended by the client (e.g. prefixed with timestamps).

weisenje commented 1 year ago

@chrisage: for the Java client, my thought is to have the UtilityClient method accept a PrintStream. So the new signature would look something like this (new things in bold)

public boolean loadIngestionPackage(
     File ingestionPackageFile, 
     String serverAndPort, 
     String serverType, 
     boolean clear, 
     String defaultModelGraph,                                      
     String defaultDataGraph,                                   
     boolean showWarnings,                               
     PrintStream responseStream) throws Exception {

The responseStream would look like this (INFO: etc removed, and displaying warnings only if showWarnings is true):

Loading manifest for 'Entity Resolution'...
Clear graph http://junit/G7JZH4J3E/200005868/auto/rack001/model
Clear graph http://junit/G7JZH4J3E/200005868/auto/rack001/data
Loading manifest for 'RACK ontology'...
...

If any ERROR: was received, the method would return false indicating a failed load.

Does this work for you?

glguy commented 1 year ago

The downside here is that the client can't do client-specific things to make this rendering pretty. With the tags on lines I can do console color coding and in the RACK-UI you can do HTML coloring of output lines.

weisenje commented 1 year ago

Maybe the client should just return the tagged lines (including WARNINGs) for now (as shown in the original sample above), and if we see opportunity to make the client do something further with it at a later point, we can do it then.

weisenje commented 1 year ago

SemTK service endpoint now prepends INFO/WARNING/ERROR tags, like this:

INFO: Load CSV Package-1\PROV_S_ACTIVITY1.csv as class http://arcos.rack/PROV-S#ACTIVITY
WARNING: Input is missing these columns: endedattime, goal_identifier
ERROR: Cannot find a top-level manifest in resource14457517151095630833.zip

Also provided Java method on com.ge.research.semtk.utility.Logger to parse level and message out of a given log line:

Logger.getLevel(String s)
Logger.getMessage(String s)

Created a separate issue to consider any client improvements.

glguy commented 1 year ago

@weisenje Minor issue, but the space seems to be missing after DEBUG: to match the other messages.

weisenje commented 1 year ago

@glguy the endpoint does not (explicitly, at least) produce any DEBUG, only INFO/WARNING/ERROR. Can you show me a full line so we can get to the bottom of it?

glguy commented 1 year ago

OH, this debug output was not from the endpoint, then. I was just confused. Something else was generating the output. For reference I had this:

DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): localhost:12060

So there's no actual issue, sorry!

ge-high-assurance / RACK

Ingestion package loader use logging ERROR WARNING INFO #931