smallAreaHealthStatisticsUnit / rapidInquiryFacility

The Rapid Inquiry Facility (RIF) helps epidemiologists and public health researchers in environmental health activities.
GNU Lesser General Public License v3.0

Adds ICD 9 support #106

Closed by devilgate 5 years ago

devilgate commented 5 years ago

This change adds support for ICD 9 via a CSV file. It also removes a lot of redundant code, including what appeared to be an almost complete duplicate of the ICD 10 service.

More visibly, I've changed the name of the Taxonomy Services WAR file, and correspondingly the URL of the service. The file is now taxonomies.war, and the URL is /taxonomies/service/.

I changed it because the repetition in /taxonomyServices/taxonomyServices/ annoyed me; but also because it feels better semantically now.

It also supports -- and documents -- the easy addition of other CSV-based taxonomies in future.
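As an illustration of what a CSV-based taxonomy involves, here is a minimal sketch of reading terms from CSV lines. It is not the code in this PR: the class name `CsvTaxonomySketch`, the `parse` method and the two-column `code,description` layout are all assumptions made for the example, and it does not handle quoted fields.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the RIF code base. It assumes each line is
// "code,description" with no quoting; the real file layout may differ.
final class CsvTaxonomySketch {

    // Returns code -> description, preserving the order of the input lines.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> terms = new LinkedHashMap<>();
        for (String rawLine : lines) {
            String line = rawLine.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            // Split on the first comma only, so descriptions may contain commas.
            int comma = line.indexOf(',');
            if (comma < 0) {
                continue; // skip lines with no separator
            }
            String code = line.substring(0, comma).trim();
            String description = line.substring(comma + 1).trim();
            terms.put(code, description);
        }
        return terms;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "162,Malignant neoplasm of trachea, bronchus and lung",
            "250,Diabetes mellitus");
        parse(lines).forEach((code, label) -> System.out.println(code + " -> " + label));
    }
}
```

Splitting on the first comma only is a deliberate simplification; a real loader would more likely use a CSV library that understands quoting.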

peterhambly commented 5 years ago

ICD9 functionality is OK. I have added a CSS fix to restore the select arrow functionality. The risk analysis changes cause a Tomcat server crash for both disease mapping and risk analysis:

Rengine.eval(Rpid<-Sys.getpid()): BEGIN Thread[http-nio-8080-exec-3,5,main]
Rengine.eval(Rpid<-Sys.getpid()): END (OK)Thread[http-nio-8080-exec-3,5,main]
Rengine.eval(Rpid): BEGIN Thread[http-nio-8080-exec-3,5,main]
Rengine.eval(Rpid): END (OK)Thread[http-nio-8080-exec-3,5,main]
Rengine.eval(rm(list=ls())): BEGIN Thread[http-nio-8080-exec-3,5,main]
Rengine.eval(rm(list=ls())): END (OK)Thread[http-nio-8080-exec-3,5,main]
Rengine.eval(print(.libPaths())): BEGIN Thread[http-nio-8080-exec-3,5,main]
Rengine.eval(print(.libPaths())): END (OK)Thread[http-nio-8080-exec-3,5,main]
Rengine.eval(print(sessionInfo())): BEGIN Thread[http-nio-8080-exec-3,5,main]
Rengine.eval(print(sessionInfo())): END (OK)Thread[http-nio-8080-exec-3,5,main]
Rengine.eval(source('C:\Program Files\Apache Software Foundation\Tomcat 8.5\webapps\rifServices\WEB-INF\classes\OdbcHandler.R')): BEGIN Thread[http-nio-8080-exec-3,5,main]
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  EXCEPTION_GUARD_PAGE (0x80000001) at pc=0x00007ff937054ff8, pid=28484, tid=0x0000000000004880
#
# JRE version: Java(TM) SE Runtime Environment (8.0_162-b12) (build 1.8.0_162-b12)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.162-b12 mixed mode windows-amd64 compressed oops)
# Problematic frame:
# C  [ntdll.dll+0xa4ff8]
#
# Failed to write core dump. Minidumps are not enabled by default on client versions of Windows
#
# An error report file with more information is saved as:
# C:\Program Files\Apache Software Foundation\Tomcat 8.5\bin\hs_err_pid28484.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

hs_err_pid28484.log

Middleware log:

Rengine Started; Rpid: 28484; JRI version: 266; thread ID: 21
09:20:10.347 [http-nio-8080-exec-3] INFO  org.sahsu.rif.generic.util.RIFLogger : [org.sahsu.rif.services.datastorage.common.StatisticsProcessing]:
R parameters: 
userID=peter
password=XXXXXXXX
odbcDataSource=SQLServer13
db_driver_prefix=jdbc:sqlserver
db_host=localhost
db_port=1433
db_name=sahsuland
db_driver_class_name=com.microsoft.sqlserver.jdbc.SQLServerDriver
db_url=jdbc:sqlserver://localhost:1433
java_lib_path_dir=C:\Program Files\Apache Software Foundation\Tomcat 8.5\webapps\rifServices\WEB-INF\lib
odbcDataSource=SQLServer13
studyID=138
investigationName=TEST_1002
names.adj.1=NONE
adj.1=FALSE
studyName=TEST 1002
studyDescription=TEST 1002 LUNG CANCER HET 95_96
investigationId=133
studyType=diseaseMapping
model=HET

09:20:10.348 [http-nio-8080-exec-3] INFO  org.sahsu.rif.generic.util.RIFLogger : [org.sahsu.rif.services.system.RIFServiceStartupOptions]:
RIFServiceStartupOptions is web deployment
09:20:10.352 [http-nio-8080-exec-3] INFO  org.sahsu.rif.generic.util.RIFLogger : [org.sahsu.rif.services.system.RIFServiceStartupOptions]:
Print java.library.path:
[0] C:\Program Files\Java\jdk1.8.0_162\bin;
[1] C:\WINDOWS\Sun\Java\bin;
[2] C:\WINDOWS\system32;
[3] C:\WINDOWS;
[4] C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\iCLS\;
[5] C:\Program Files\Intel\Intel(R) Management Engine Components\iCLS\;
[6] C:\WINDOWS\system32;
[7] C:\WINDOWS;
[8] C:\WINDOWS\System32\Wbem;
[9] C:\WINDOWS\System32\WindowsPowerShell\v1.0\;
[10] C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\DAL;
[11] C:\Program Files\Intel\Intel(R) Management Engine Components\DAL;
[12] C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\IPT;
[13] C:\Program Files\Intel\Intel(R) Management Engine Components\IPT;
[14] C:\Program Files\PostgreSQL\9.6\bin;
[15] C:\Program Files\Java\jdk1.8.0_162\bin;
[16] C:\Program Files\Apache Software Foundation\apache-maven-3.5.3\bin;
[17] C:\Program Files\Microsoft SQL Server\Client SDK\ODBC\130\Tools\Binn\;
[18] C:\Program Files (x86)\Microsoft SQL Server\130\Tools\Binn\;
[19] C:\Program Files\Microsoft SQL Server\130\Tools\Binn\;
[20] C:\Program Files\Microsoft SQL Server\130\DTS\Binn\;
[21] C:\Program Files (x86)\Microsoft SQL Server\Client SDK\ODBC\130\Tools\Binn\;
[22] C:\Program Files (x86)\Microsoft SQL Server\140\Tools\Binn\;
[23] C:\Program Files (x86)\Microsoft SQL Server\140\DTS\Binn\;
[24] C:\Program Files (x86)\Microsoft SQL Server\140\Tools\Binn\ManagementStudio\;
[25] C:\Program Files\nodejs\;
[26] C:\Program Files\dotnet\;
[27] C:\MinGW\msys\1.0\bin;
[28] C:\Program Files\Apache Software Foundation\Tomcat 8.5\bin;
[29] C:\Program Files\R\R-3.4.4\bin\x64;
[30] C:\Program Files\R\R-3.4.4\library\rJava\jri\x64;
[31] C:\Program Files\MiKTeX 2.9\miktex\bin\x64\;
[32] C:\Python27;
[33] C:\WINDOWS\System32\OpenSSH\;
[34] C:\Program Files\Git\cmd;
[35] C:\Python27\Scripts;
[36] C:\Users\admin\AppData\Local\Microsoft\WindowsApps;
[37] .;

09:20:10.353 [http-nio-8080-exec-3] INFO  org.sahsu.rif.generic.util.RIFLogger : [org.sahsu.rif.services.system.RIFServiceStartupOptions]:
Returning path: C:\Program Files\Apache Software Foundation\Tomcat 8.5\webapps\rifServices\WEB-INF\classes
09:20:10.353 [http-nio-8080-exec-3] INFO  org.sahsu.rif.generic.util.RIFLogger : [org.sahsu.rif.services.datastorage.common.StatisticsProcessing]:
Source: 'C:\Program Files\Apache Software Foundation\Tomcat 8.5\webapps\rifServices\WEB-INF\classes\OdbcHandler.R'

The crash did NOT re-occur when I reverted to master. OdbcHandler.R must be a suspect. I was using SQL Server.

Note that the ICD search box does a full wildcard search (i.e. %162%). This returns more rows than a user might expect (e.g. the attached screen capture).

It needed to be built twice; the first time I got errors:

This is a problem with mvn install not building taxonomies.war correctly. The taxonomyservice make target works OK. Care needs to be taken with existing systems that have full ICD10 codes, as claMLTaxonomyService is now (correctly) capitalised as ClaMLTaxonomyService in the new version.

taxonomyservice:    
    $(MAVEN) --version
    cd rifGenericLibrary && $(MAVEN) $(MAVEN_FLAGS) install
    cd taxonomyServices && $(MAVEN) $(MAVEN_FLAGS) install
    $(COPY) taxonomyServices/target/taxonomyServices.war .
!!!!!!!!!!!!!!!!!!!!! RIFTaxonomyWebServiceApplication !!!!!!
08:31:05.957 [http-nio-8080-exec-4] INFO  org.sahsu.rif.generic.util.TaxonomyLogger : [org.sahsu.rif.generic.taxonomyservices.TaxonomyServiceConfigurationXMLReader]:
TaxonomyService configuration file: C:\Program Files\Apache Software Foundation\Tomcat 8.5\conf\TaxonomyServicesConfiguration.xml
08:31:05.972 [http-nio-8080-exec-4] ERROR org.sahsu.rif.generic.util.TaxonomyLogger : [org.sahsu.rif.generic.taxonomyservices.TaxonomyServiceConfigurationXMLReader]:
Exception initializing taxonomyService: org.sahsu.taxonomyservices.claMLTaxonomyService
getMessage:          ClassNotFoundException: org.sahsu.taxonomyservices.claMLTaxonomyService
getRootCauseMessage: ClassNotFoundException: org.sahsu.taxonomyservices.claMLTaxonomyService
getThrowableCount:   1
getRootCauseStackTrace >>>
java.lang.ClassNotFoundException: org.sahsu.taxonomyservices.claMLTaxonomyService
    at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1291)
    at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1119)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:264)
    at org.sahsu.rif.generic.taxonomyservices.TaxonomyServiceConfigurationXMLReader.readFile(TaxonomyServiceConfigurationXMLReader.java:118)
    at org.sahsu.rif.generic.taxonomyservices.FederatedTaxonomyService.initialise(FederatedTaxonomyService.java:57)
    at org.sahsu.taxonomyservices.RIFTaxonomyWebServiceResource.initialiseService(RIFTaxonomyWebServiceResource.java:48)
devilgate commented 5 years ago

Hi Peter,

First, sorry I missed the makefile change. And I've also noticed one place where I missed the URL change (the zipfile export). So I'm working on fixing that.

Now to take your points, from least to most severe.

Build problem

You said, "This is a problem with mvn install not building taxonomies.war correctly." Did you include a clean? mvn clean install? Because that should prevent any problems like that. Also the business with the renamed class shouldn't happen with a full clean build.

Search showing too much

I haven't changed the way the search works. It does a full wildcard search in the text field, and if that doesn't return anything, it does the same in the label field. What result would you expect from specifying a partial label? It would be easy enough to change it to, for example, treat the search string as a prefix. But I worry that then we would be taking functionality away.
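To make the trade-off concrete, here is a minimal sketch of the two matching strategies side by side. It is not the taxonomy service's actual search code: `TermSearchSketch`, `Mode` and the map layout (label as key, text as value) are hypothetical, and the sample codes in the comments are only illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch, not the RIF implementation. Terms are modelled as a
// map of label (e.g. "162") -> text (the description).
final class TermSearchSketch {

    enum Mode { CONTAINS, PREFIX }

    // Searches the text field first; only if nothing matches does it fall
    // back to the label field, mirroring the behaviour described above.
    static List<String> search(Map<String, String> terms, String query, Mode mode) {
        String q = query.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> e : terms.entrySet()) {
            if (matches(e.getValue(), q, mode)) {
                hits.add(e.getKey());
            }
        }
        if (!hits.isEmpty()) {
            return hits;
        }
        for (Map.Entry<String, String> e : terms.entrySet()) {
            if (matches(e.getKey(), q, mode)) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    // CONTAINS behaves like %query%; PREFIX behaves like query%.
    private static boolean matches(String value, String q, Mode mode) {
        String v = value.toLowerCase(Locale.ROOT);
        return mode == Mode.PREFIX ? v.startsWith(q) : v.contains(q);
    }
}
```

With labels such as 162, 1620 and 2162, a CONTAINS search for "162" would return all three, while PREFIX would drop 2162. That is the functionality a prefix-only search would take away.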

Crash in ODBC access

You said, "Risk analysis changes cause a tomcat server crash." How do you know it was caused by the RA changes? Aren't they already in master?

And from the error ("The crash happened outside the Java Virtual Machine in native code") I don't have any idea how to begin debugging it. As you'd expect I don't get it on the Mac. OdbcHandler.R hasn't changed, so I don't understand why it would suddenly happen now. I'll try it on Windows, but I don't have the DB on SQL Server set up.

One thing I did notice in the above output, though:

odbcDataSource=SQLServer13
db_driver_prefix=jdbc:sqlserver

Could that driver prefix be causing the problem? Similarly db_url=jdbc:sqlserver://localhost:1433. But I guess maybe it doesn't use those values for ODBC.

Other than that, if you've got any ideas for how I should proceed, please let me know.

peterhambly commented 5 years ago

Hi Martin,

This bug also reproduces with Postgres on Windows. It is caused by R on Windows, probably rJava. Can you tell me which version of R/JRI you are using? I have R 3.4.0 and rJava 0.9-8. The latest versions are 3.5.1 and 0.9-10.

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  org.rosuda.JRI.Rengine.rniParse(Ljava/lang/String;I)J+0
j  org.rosuda.JRI.Rengine.eval(Ljava/lang/String;Z)Lorg/rosuda/JRI/REXP;+53
j  org.rosuda.JRI.Rengine.eval(Ljava/lang/String;)Lorg/rosuda/JRI/REXP;+3
j  org.sahsu.rif.services.datastorage.common.CommonRService.sourceRScript(Lorg/rosuda/JRI/Rengine;Ljava/nio/file/Path;)V+76
j  org.sahsu.rif.services.datastorage.common.StatisticsProcessing.performStep(Ljava/sql/Connection;Lorg/sahsu/rif/services/concepts/RIFStudySubmission;Ljava/lang/String;)V+810
j  org.sahsu.rif.services.datastorage.common.RunStudyThread.smoothResults()V+16
j  org.sahsu.rif.services.datastorage.common.RunStudyThread.run()V+307
j  java.lang.Thread.run()V+11
j  org.sahsu.rif.services.datastorage.common.StudySubmissionService.submitStudy(Lorg/sahsu/rif/generic/concepts/User;Lorg/sahsu/rif/services/concepts/RIFStudySubmission;Ljava/io/File;)Ljava/lang/String;+218
j  org.sahsu.rif.services.rest.WebService.submitStudy(Ljavax/servlet/http/HttpServletRequest;Ljava/lang/String;Ljava/lang/String;Ljava/io/InputStream;)Ljavax/ws/rs/core/Response;+278
j  org.sahsu.rif.services.rest.StudySubmissionServiceResource.submitStudy(Ljavax/servlet/http/HttpServletRequest;Ljava/lang/String;Ljava/lang/String;Ljava/io/InputStream;)Ljavax/ws/rs/core/Response;+6
v  ~StubRoutines::call_stub
devilgate commented 5 years ago

R version 3.5.1 (2018-07-02) -- "Feather Spray"

rJava: 0.9-10

devilgate commented 5 years ago

Hopefully upgrading will make your problem go away.

I'm still trying to close the can of worms I opened by fixing the URL change I mentioned missing out, above. It will make the whole thing better, eventually, though.

peterhambly commented 5 years ago

OK, the bad news is that upgrading to R 3.5.1 and rJava 0.9-10 on Postgres (completely removing all old R versions and checking that R_HOME is correct) has no effect. I get the same crash when sourcing the first R script (JdbcHandler.R, or OdbcHandler.R on SQL Server) in:

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  org.rosuda.JRI.Rengine.rniParse(Ljava/lang/String;I)J+0
j  org.rosuda.JRI.Rengine.eval(Ljava/lang/String;Z)Lorg/rosuda/JRI/REXP;+53
j  org.rosuda.JRI.Rengine.eval(Ljava/lang/String;)Lorg/rosuda/JRI/REXP;+3
j  org.sahsu.rif.services.datastorage.common.CommonRService.sourceRScript(Lorg/rosuda/JRI/Rengine;Ljava/nio/file/Path;)V+76

hs_err_pid5796.log

R works properly in #risk-analysis-fixes-required-enhancement (with no middleware changes)

peterhambly commented 5 years ago

OK: I have spotted the following likely cause:

Crashes:

09:45:03.412 [http-nio-8080-exec-10] INFO  org.sahsu.rif.generic.util.RIFLogger : [org.sahsu.rif.services.datastorage.common.StatisticsProcessing]:
Source: 'C:\Program Files\Apache Software Foundation\Tomcat 8.5\webapps\rifServices\WEB-INF\classes\JdbcHandler.R

Does not crash:

11:02:04.266 [http-nio-8080-exec-9] INFO  org.sahsu.rif.generic.util.RIFLogger : [org.sahsu.rif.services.datastorage.common.SmoothResultsSubmissionStep]:
Source(\): 'C:\\Program Files\\Apache Software Foundation\\Tomcat 8.5\\webapps\\rifServices\\WEB-INF\\classes\\JdbcHandler.R
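The difference between the two log lines is consistent with how R parses string literals: a backslash starts an escape sequence, so a raw Windows path inside `source('...')` is not read as intended. The sketch below shows one way to build the command so that no backslashes reach R. It is an illustration under assumptions, not the fix applied in this branch: `RSourceCommandSketch` and `sourceCommand` are hypothetical names, and it relies on R accepting forward slashes in Windows paths.

```java
// Hypothetical helper, not part of CommonRService. It turns a script path
// into an R source() call that is safe to pass to Rengine.eval().
final class RSourceCommandSketch {

    static String sourceCommand(String scriptPath) {
        // R treats '\' as an escape character inside string literals, so
        // replace Windows separators with '/', which R accepts on Windows.
        String normalised = scriptPath.replace('\\', '/');
        // Escape single quotes so the path cannot end the R literal early.
        String quoted = normalised.replace("'", "\\'");
        return "source('" + quoted + "')";
    }

    public static void main(String[] args) {
        System.out.println(sourceCommand(
            "C:\\Program Files\\Apache Software Foundation\\Tomcat 8.5\\webapps"
            + "\\rifServices\\WEB-INF\\classes\\JdbcHandler.R"));
    }
}
```

Doubling each backslash, as in the non-crashing log line, would work equally well. Converting to forward slashes just avoids a second layer of escaping. Note that on Unix a file name containing a literal backslash would be altered by this sketch.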
devilgate commented 5 years ago

Looks promising, but nothing should include any specific paths, now. It should all use the TomcatFile classes I wrote, which internally use Java Path objects.

devilgate commented 5 years ago

Although it does all depend on this:

Path scriptPath = FileSystems.getDefault().getPath(
        rifStartupOptions.getClassesDirectory());

So it relies on what's in the startup options file.

The above is from StatisticsProcessing.java.

devilgate commented 5 years ago

Actually, no: that does use a TomcatFile under the covers. Trouble is, it returns it as a String. Where are you getting that path from?

peterhambly commented 5 years ago

CommonRService.java in my branch contains escaping for Windows paths. I suspect the lack of this escaping is the fault.

    // Source R script
    @Override
    public void sourceRScript(Rengine rengine, String scriptName) 
        throws Exception {

        File rScript=new File(scriptName);
        if (rScript.exists()) {
            String nScriptName=scriptName;
            if (File.separatorChar == '\\') { // Windooze!!! R path strings need to be escaped; they must go through a shell 
                                              // like runtime at some point
                nScriptName=scriptName.replace("\\","\\\\");
                rifLogger.info(this.getClass(), "Source(" + File.separator + "): '" + nScriptName + "'");
            }
            else {
                rifLogger.info(this.getClass(), "Source: '" + nScriptName + "'");
            }
            rengine.eval("source('" + nScriptName + "')");
            rifLogger.info(this.getClass(), "Done: '" + nScriptName + "'");
        }
        else {
            throw new Exception("Cannot find R script: '" + scriptName + "'");
        }
    }
devilgate commented 5 years ago

That's a very old version.

    @Override
    public void sourceRScript(Rengine rengine, Path script)
        throws Exception {

        if (script.toFile().exists()) {
            rifLogger.info(this.getClass(), "Source: '" + script + "'");
            rengine.eval("source('" + script.toString() + "')");
            rifLogger.info(this.getClass(), "Done: '" + script + "'");
        }
        else {
            throw new Exception("Cannot find R script: '" + script + "'");
        }
    }
peterhambly commented 5 years ago

I have fixed the error handlers. It now runs, but with an R error: ERROR: $ operator is invalid for atomic vectors

Statistics_JRI.R errorTrace: >>>
Trying JDBC connection using driver class org.postgresql.Driver, lib dir C:\Program Files\Apache Software Foundation\Tomcat 8.5\webapps\rifServices\WEB-INF\lib, URL jdbc:postgresql://localhost:5432/sahsuland, user peter
... JDBC connection established 
Querying by JDBC: SELECT rif40_sql_pkg.rif40_startup() 
Connected to DB
connectToDb exitValue: 0
About to fetch extract table
JDBC EXTRACT TABLE NAME: rif_studies.s534_extract
Querying by JDBC: select * from rif_studies.s534_extract 
JDBC Saving extract frame to: scratchSpace/d501-600/s534/data/tmp_s534_extract.csv
JDBC rif_studies.s534_extract numberOfRows=39160==
About to calculate band data
Covariates: none
About to run homogeneity tests
Stack tracer >>>

 performRiskAnal.R#506: .handleSimpleError(function (obj) 
{
    calls = sys 
performRiskAnal.R#506: unique(Bands$band_id) 
Statistics_JRI.R#304: performHomogAnal(resultBands) 
Statistics_Common.R#175: withVisible(expr) 
Statistics_Common.R#175: withCallingHandlers(withVisible(expr), error = err withErrorTracing({
    cat(paste0("About to fetch extract table", "\n"))
   doTryCatch(return(expr), name, parentenv, handler) tryCatchOne(expr, names, parentenv, handlers[[1]]) tryCatchList(expr, names[-nh], parentenv, handlers[-nh]) doTryCatch(return(expr), name, parentenv, handler) tryCatchOne(tryCatchList(expr, names[-nh], parentenv, handlers[-nh]), names tryCatchList(expr, classes, parentenv, handlers) tryCatch({
    withErrorTracing({
        cat(paste0("About to fetch extrac eval(expr, pf) eval(expr, pf) withVisible(eval(expr, pf)) evalVis(expr) Statistics_JRI.R#278: capture.output({
    tryCatch({
        withErrorTrac runRRiskAnalFunctions() 
<<< End of stack tracer.
callRiskAnal() ERROR:  $ operator is invalid for atomic vectors ; call stack:   
callRiskAnal exitValue: 1

<<< End of Statistics_JRI.R errorTrace.
devilgate commented 5 years ago

Oh, I had that at one point. That symptom, at least. Let me see if I can find what the cause was... no, sorry, I remember that error message, but I don't seem to have a note of what caused it or fixed it.

devilgate commented 5 years ago

Gaahh!!! I just merged, and now your ancient version of CommonRService is in my branch!!!

Never mind, I'll sort it out.

Wait, I misread it, you just copied the old comments over.

devilgate commented 5 years ago

Hi Peter, would you take another look at this, please? Everything is working now, as far as I can tell. Changing the URL in the original commit broke the file download feature, and digging in to fix that opened another can of worms.

Among other things, adding the support for different taxonomies seems to have changed the possibilities of the JSON that can be generated, so I've had to work around that, especially in GetStudyJSON.java and RIFTaxonomyWebServiceResource.java; but also in rifc-dsub-params.js. You might want to take a particular look at the latter.

@peterhambly

peterhambly commented 5 years ago

The changes to rifc-dsub-params.js and the associated REST message have broken the selection of ICD codes: all appear to be "selected" although only one is. Only one can be selected. Trying to submit gets: Error: Record "Health code" field "Code" cannot be empty. The code to suppress chapter headings is still working but a mysterious search term of "'#" for "NEOPLASMS" has appeared that will probably break study submission.

I am getting JavaScript errors from Angular in rifc-dsub-params.js, so I suspect this is where the error is.

Error: res.data.terms is null
handleTextSearch@http://localhost:8080/RIF40/dashboards/submission/controllers/rifc-dsub-params.js:332:25
e/<@http://localhost:8080/RIF40/libs/standalone/angular.min.js:131:20
$eval@http://localhost:8080/RIF40/libs/standalone/angular.min.js:145:343
$digest@http://localhost:8080/RIF40/libs/standalone/angular.min.js:142:412
$apply@http://localhost:8080/RIF40/libs/standalone/angular.min.js:146:111
l@http://localhost:8080/RIF40/libs/standalone/angular.min.js:97:320
J@http://localhost:8080/RIF40/libs/standalone/angular.min.js:102:34
gg/</t.onload@http://localhost:8080/RIF40/libs/standalone/angular.min.js:103:4

I still get "ERROR: $ operator is invalid for atomic vectors ; call stack:" if I run a previously saved study.

devilgate commented 5 years ago

OK, I see what you mean about selection. Clearly I didn't click on a disease.

Where do you go to see those Angular messages? I don't see them in either the FrontEndLogger file or in the Safari JavaScript Console.

The other thing is from R, isn't it? Brandon's just having a look at the Risk Analysis functionality at the moment.

devilgate commented 5 years ago

And I don't understand what you mean about the mysterious search term. Can you add a screenshot?

devilgate commented 5 years ago

I've just pushed a commit that should fix everything going green. I had an @XmlElement annotation on namespace instead of identifier, so every result was showing as matching.

peterhambly commented 5 years ago

All fine now apart from the R fault that is with Brandon: $ operator is invalid for atomic vectors. I added a trap in the front end to avoid an exception if the data is not returned as expected.

devilgate commented 5 years ago

I'd like to get this branch merged into master. The outstanding error is to do with Risk Analysis, not the taxonomy changes, and it doesn't break anything that's working at the moment.