ge-high-assurance / RACK

DARPA's Automated Rapid Certification of Software (ARCOS) project called Rapid Assurance Curation Kit (RACK)
BSD 3-Clause "New" or "Revised" License

Error in using generic ingestion #289

Closed AbhaMoitra closed 3 years ago

AbhaMoitra commented 3 years ago

I loaded the generic json ingest_REQUIREMENT.json. Then I added instance data to ingest_REQUIREMENT.csv (the data is basically the HighLevelRequirement data from Turnstile, but with the identifier name altered); this data file is attached here with a "txt" extension. I was able to import the data successfully, but when I queried the ingest_REQUIREMENT.json, I got an error.

ingest_REQUIREMENT.csv.txt

cuddihyge commented 3 years ago

This might be a clue. I found this during a random SemTK startup, NOT while trying to run the query, but it looks interesting.

Found this in SemTK:

Caused by: java.lang.Exception: Fuseki query returned empty response

Saw this on the Fuseki console:

Server     ERROR Exception in initialization: caught: Process ID 18268 can't open database at location C:\Users\200001934\apache-jena-fuseki-3.10.0\run\databases\fuseki_test_persistent\ because it is already locked by the process with PID 10192. TDB databases do not permit concurrent usage across JVMs so in order to prevent possible data corruption you cannot open this location from the JVM that does not own the lock for the dataset
[2021-02-05 10:42:23] WebAppContext WARN  Failed startup of context o.e.j.w.WebAppContext@5fac521d{Apache Jena Fuseki Server,/,file:///C:/Users/200001934/apache-jena-fuseki-3.10.0/webapp/,UNAVAILABLE}
org.apache.jena.assembler.exceptions.AssemblerException: caught: Process ID 18268 can't open database at location C:\Users\200001934\apache-jena-fuseki-3.10.0\run\databases\fuseki_test_persistent\ because it is already locked by the process with PID 10192. TDB databases do not permit concurrent usage across JVMs so in order to prevent possible data corruption you cannot open this location from the JVM that does not own the lock for the dataset
  doing:
    root: http://base/#tdb_dataset_readwrite with type: http://jena.hpl.hp.com/2008/tdb#DatasetTDB assembler class: class org.apache.jena.tdb.assembler.DatasetAssemblerTDB
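
The lock message above names the PID that owns the database. A minimal sketch of checking for that lock before starting a second Fuseki follows. It assumes the lock is a small text file named `tdb.lock` in the database directory containing the owning process ID; the helper name `tdb_lock_owner` is made up for illustration and is not part of Jena or SemTK.

```python
from pathlib import Path
from typing import Optional


def tdb_lock_owner(db_dir: str) -> Optional[int]:
    """Return the PID recorded in the TDB lock file, or None if none is found.

    Assumption: the lock is a text file called ``tdb.lock`` inside the
    database directory, and its content is the owning process ID.
    """
    lock_file = Path(db_dir) / "tdb.lock"
    try:
        text = lock_file.read_text().strip()
    except FileNotFoundError:
        # No lock file, so no other JVM appears to hold this location.
        return None
    # An empty or non-numeric file is treated as "no known owner".
    return int(text) if text.isdigit() else None
```

If the function returns a PID, stopping that process (or pointing the second server at a different database location) should avoid the startup failure shown in the log.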

cuddihyge commented 3 years ago

The above error goes away when I

cuddihyge commented 3 years ago

I am unable to reproduce Abha's bug in a non-Docker development environment. I will need @tuxji to help. I think this is high priority because it is likely to be related to the problem Robert Stroud is reporting:

I have been working my way through some of the tutorial examples, but I have run into a problem - everything seems to be working until I run a query, at which point I get the unhelpful error message:

    Query failed with status: error Server response:

AbhaMoitra commented 3 years ago

@cuddihyge : let me know whenever this is ready for testing.

cuddihyge commented 3 years ago

I don't think we've made any progress on this one. I haven't reproduced it in my dev environment yet.

tuxji commented 3 years ago

@AbhaMoitra Can you list every step needed to reproduce your issue, starting from my running docker pull gehighassurance/rack-box:dev? I don't understand the steps "loaded the generic json ingest_REQUIREMENT.json", "imported ingest_REQUIREMENT.csv", and "queried the ingest_REQUIREMENT.json" in enough detail to be able to carry them out myself. What was supposed to happen and what error message did you get instead? If you would rather show me visually by sharing your screen with me, you can send me a meeting invite.

AbhaMoitra commented 3 years ago

@tuxji: some of the errors have now disappeared because of updates by @cuddihyge. What is still giving me an error is as follows:

AbhaMoitra commented 3 years ago

I meant ingest_AVTIVITY.json -> ingest_ACTIVITY.json

cuddihyge commented 3 years ago

I have tried to reproduce this using the latest docker dev and I am not seeing an error:

image

tuxji commented 3 years ago

OK, your issue must be your machine's resources (available memory, CPU, and/or disk). I did the following steps and got 52 rows of results back without seeing any errors:

  1. Ran docker pull gehighassurance/rack-box:dev
  2. Ran docker run --detach -p 80:80 -p 12050-12092:12050-12092 gehighassurance/rack-box:dev
  3. Waited for all services to finish starting up (viewed logs in Docker Desktop until output had stopped for a while)
  4. Visited localhost in browser and clicked on SPARQLgraph link
  5. Found RACK/nodegroups/ingestion/ingest_ACTIVITY.json in Windows Explorer and dragged it into sparqlGraph pane
  6. Clicked Save button in dialog popup box
  7. Clicked Run button
  8. Got 52 rows and saw no errors in Docker Desktop logs pane
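
Step 3 above relies on watching the logs by eye. A rough sketch of automating the wait is to poll a published port until it accepts connections. The function name `wait_for_port` and its defaults are assumptions for illustration, and an open port only shows that something is listening, not that every service has finished starting.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 300.0,
                  interval: float = 2.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: give up once the deadline has passed.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For the container started in step 2, `wait_for_port("localhost", 80)` would wait for the web server on the published port 80 before opening the browser.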

I see Paul had no problems either. By the way, this kind of detail (the steps above) is what I like to have in a bug report.

AbhaMoitra commented 3 years ago

I am not convinced that it is a resource issue on my machine. I think it is a misalignment with the ontology. Now, I am using the dev version I pulled on 2/8/2021, so you might tell me to pull a new dev version, but bear with me. If you look at the screen capture that Paul has above, there are 10 properties listed within the central ACTIVITY box. One of these is "performedBy" (I know it is very hard to see the details, but it is the 4th from the bottom). However, when I look in the ontology, performedBy is not a property of ACTIVITY or any ancestor of ACTIVITY.
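
A check like "is this property declared on the class or any ancestor" can be sketched as a walk up the class hierarchy. The two tables below are toy data invented for illustration and are NOT the RACK ontology; only the names ACTIVITY, ANALYSIS, and performedBy come from this thread, and `declaring_classes` is a hypothetical helper.

```python
from typing import Dict, List, Optional, Set

# Toy data for illustration only; this is not the RACK ontology.
PARENT: Dict[str, Optional[str]] = {
    "ACTIVITY": None,
    "ANALYSIS": "ACTIVITY",
}
DECLARED: Dict[str, Set[str]] = {
    "ACTIVITY": {"identifier"},
    "ANALYSIS": {"performedBy"},
}


def declaring_classes(cls: str, prop: str) -> List[str]:
    """Return cls and/or its ancestors that declare prop, nearest first."""
    found: List[str] = []
    current: Optional[str] = cls
    while current is not None:
        if prop in DECLARED.get(current, set()):
            found.append(current)
        # Move one level up; a class with no parent ends the walk.
        current = PARENT.get(current)
    return found
```

With this toy data, `declaring_classes("ACTIVITY", "performedBy")` returns an empty list, which is the situation described above: a nodegroup that lists the property on ACTIVITY would not match the ontology.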

I trimmed the json that we are talking about (it still gives me an error); the json and the ontology are shown below:

image

Now I have no idea why I am the only one of us three who is getting the error.

AbhaMoitra commented 3 years ago

@cuddihyge @tuxji: OK, here is a slightly different view that I think shows the issue more clearly. I pulled the dev branch this morning (2/11/2021), and below I am showing the ontology portion in SemTK.

image

The 2 (dangling) properties that appear after HAZARD_IDENTIFICATION are associated with ACTIVITY. I do not remember seeing this type of dangling structure before, and it relates to my previous comments. Anyway, the point is that "performedBy" is not a property of ACTIVITY. You can check that in Git; below is the result of searching for this property.

image

So, I still think there is an issue. Note also that performedBy is defined for ANALYSIS; that fragment is shown below. Note that this property appears twice there (once for ANALYSIS itself and once inherited from ACTIVITY).

image

cuddihyge commented 3 years ago

This is a simple sorting problem in the SparqlGraph pane. It is unrelated. I've fixed it.

AbhaMoitra commented 3 years ago

@tuxji @cuddihyge: So, with this fix, does performedBy show up as a property for ACTIVITY? I downloaded dev this morning (2/15) and I am unable to run it. I get the following error and the services do not start up.

image

AbhaMoitra commented 3 years ago

After updating my Docker (as suggested by John), I was able to pull and run the dev docker image today (2/17/2021), and everything seems fine. So I am going to move this issue to "Done", as it does not involve any further changes or updates.