opencb / opencga

An Open Computational Genomics Analysis platform for big data genomics analysis. OpenCGA is maintained and developed by its parent company, Zetta Genomics. Please contact support@zettagenomics.com for bug reports and feature requests.
Apache License 2.0

HBase connection issue Gen2 Datalake #1069

Closed lawrencegripper closed 5 years ago

lawrencegripper commented 5 years ago

Currently, when connecting to HBase the following error is observed:

Caused by: java.lang.ExceptionInInitializerError
        at org.apache.hadoop.hbase.ClusterId.parseFrom(ClusterId.java:64)
        at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:75)
        at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:105)
        at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:889)
        at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:645)
        ... 69 more
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem not found
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2228)
        at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2780)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2793)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2829)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2811)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:390)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:179)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:374)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
        at org.apache.hadoop.hbase.util.DynamicClassLoader.initTempDir(DynamicClassLoader.java:118)
        at org.apache.hadoop.hbase.util.DynamicClassLoader.<init>(DynamicClassLoader.java:98)
        at org.apache.hadoop.hbase.protobuf.ProtobufUtil.<clinit>(ProtobufUtil.java:250)
        ... 74 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem not found
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2134)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2226)
        ... 86 more

On the second attempt to connect to HBase, a similar error occurs, but it lists a missing protobuf class instead.
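
Traces like the one above are long, and the useful part is usually the last `Caused by:` line, because Java prints nested causes from the outermost to the innermost. A minimal sketch for pulling that line out of a saved log follows; the function name `root_cause` is illustrative and not part of OpenCGA.

```shell
# Print the innermost cause of a Java stack trace read from stdin.
# Nested causes are printed outermost first, so the last "Caused by:" line
# is the root cause.
root_cause() {
  grep '^[[:space:]]*Caused by: ' \
    | tail -n 1 \
    | sed 's/^[[:space:]]*Caused by: //'
}
```

For the trace above, piping the log through `root_cause` would report `java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem not found`.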

lawrencegripper commented 5 years ago

By updating the pom to reference a version of the hadoop jar that we know contains the AzureBlobFileSystem class (done here), the error changed to the following:

WARN  Configuration:2731 - hbase-site.xml:an attempt to override final parameter: dfs.support.append;  Ignoring.
2019-01-24 12:49:44 [http-nio-8080-exec-6] ERROR OpenCGAWSServer:447 - Catch error: Problems opening connection to DB
java.lang.IllegalStateException: Problems opening connection to DB
    at org.opencb.opencga.storage.hadoop.utils.HBaseManager.getConnection(HBaseManager.java:131)
    at org.opencb.opencga.storage.hadoop.utils.HBaseManager.<init>(HBaseManager.java:79)
    at org.opencb.opencga.storage.hadoop.variant.metadata.AbstractHBaseDBAdaptor.<init>(AbstractHBaseDBAdaptor.java:48)
    at org.opencb.opencga.storage.hadoop.variant.metadata.HBaseProjectMetadataDBAdaptor.<init>(HBaseProjectMetadataDBAdaptor.java:46)
    at org.opencb.opencga.storage.hadoop.variant.metadata.HBaseVariantStorageMetadataDBAdaptorFactory.buildProjectMetadataDBAdaptor(HBaseVariantStorageMetadataDBAdaptorFactory.java:38)
    at org.opencb.opencga.storage.hadoop.variant.metadata.HBaseVariantStorageMetadataDBAdaptorFactory.buildProjectMetadataDBAdaptor(HBaseVariantStorageMetadataDBAdaptorFactory.java:13)
    at org.opencb.opencga.storage.core.metadata.StudyConfigurationManager.<init>(StudyConfigurationManager.java:76)
    at org.opencb.opencga.storage.hadoop.variant.HadoopVariantStorageEngine.getStudyConfigurationManager(HadoopVariantStorageEngine.java:923)
    at org.opencb.opencga.storage.core.manager.variant.VariantStorageManager.secure(VariantStorageManager.java:512)
    at org.opencb.opencga.storage.core.manager.variant.VariantStorageManager.get(VariantStorageManager.java:352)
    at org.opencb.opencga.server.rest.analysis.VariantAnalysisWSService.getVariants(VariantAnalysisWSService.java:350)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161)
    at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:160)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99)
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389)
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347)
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102)
    at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326)
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
    at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317)
    at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305)
    at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154)
    at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473)
    at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
    at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
    at org.opencb.opencga.server.CORSFilter.doFilter(CORSFilter.java:46)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:198)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:493)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:140)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:81)
    at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:650)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:342)
    at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:800)
    at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
    at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:806)
    at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1498)
    at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:218)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:119)
    at org.opencb.opencga.storage.hadoop.utils.HBaseManager.getConnection(HBaseManager.java:125)
    ... 61 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
    ... 64 more
Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/fs/StreamCapabilities
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at org.apache.catalina.loader.WebappClassLoaderBase.findClassInternal(WebappClassLoaderBase.java:2356)
    at org.apache.catalina.loader.WebappClassLoaderBase.findClass(WebappClassLoaderBase.java:830)
    at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1297)
    at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1156)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.initialize(AzureBlobFileSystem.java:103)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2795)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2829)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2811)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:390)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:179)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:374)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
    at org.apache.hadoop.hbase.util.DynamicClassLoader.initTempDir(DynamicClassLoader.java:118)
    at org.apache.hadoop.hbase.util.DynamicClassLoader.<init>(DynamicClassLoader.java:98)
    at org.apache.hadoop.hbase.protobuf.ProtobufUtil.<clinit>(ProtobufUtil.java:250)
    at org.apache.hadoop.hbase.ClusterId.parseFrom(ClusterId.java:64)
    at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:75)
    at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:105)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:889)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:645)
    ... 69 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.StreamCapabilities
    at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1328)
    at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1156)
    ... 93 more

My current thinking is that the version used for all of the hadoop dependencies needs to be updated.

Currently, we identify which version of the Hortonworks jars contains the class we need with the following:

~/source/hacks/opencga-azure/arm [lg/azure-fitness L|✚ 1] 
17:24 $ jar -tf hadoop-azure-3.1.1.3.0.2.0-50.jar | grep AzureBlob
org/apache/hadoop/fs/azurebfs/contracts/exceptions/AzureBlobFileSystemException.class
org/apache/hadoop/fs/azurebfs/AzureBlobFileSystem$1.class
org/apache/hadoop/fs/azurebfs/AzureBlobFileSystem$2.class
org/apache/hadoop/fs/azurebfs/AzureBlobFileSystem$FileSystemOperation.class
org/apache/hadoop/fs/azurebfs/AzureBlobFileSystem.class
org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore$VersionedFileStatus.class
org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore.class
org/apache/hadoop/fs/azurebfs/SecureAzureBlobFileSystem.class

lawrencegripper commented 5 years ago

I'm using jar -tf ./build/opencga-storage-hadoop-core-1.4.0-rc3-dev-jar-with-dependencies.jar | grep AzureBlob after a build has completed to see if the class is included.
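
A substring match such as `grep AzureBlob` also hits related entries (`AzureBlobFileSystemException`, `AzureBlobFileSystemStore`, and so on), so it can report success when the exact class is absent. A small sketch of an exact check against a jar listing follows; the helper names `class_to_entry` and `listing_has_class` are made up for illustration.

```shell
# Convert a fully qualified Java class name into the entry path used inside a jar.
class_to_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Read a jar listing (for example the output of `jar -tf`) on stdin and succeed
# only if the exact entry for the given class is present.
# -x matches whole lines, -F treats the entry as a fixed string, not a regex.
listing_has_class() {
  grep -qxF -- "$(class_to_entry "$1")"
}
```

Usage would look like `jar -tf some.jar | listing_has_class org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem && echo present`.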

Although the change in the PR did make this command return the expected classes, I think the build is still broken, as I've only updated hadoop-azure.jar and not the surrounding dependencies.

When I attempt to fix this by setting the hadoop.version property in the storage pom, the jar no longer includes the AzureBlobFileSystem class:

 <properties>
                <module-opencga-storage-hadoop-deps>true</module-opencga-storage-hadoop-deps>
                <!--Version values for the default profile hdp-2.5.0-->
                <opencga-storage-hadoop-deps.classifier>hdp-2.5.6</opencga-storage-hadoop-deps.classifier>
                <hdp.dependencies.version>2.5.6.13-3</hdp.dependencies.version>

                <hadoop.version>3.1.1.3.0.2.0-50</hadoop.version>
                <hbase.version>1.1.2.${hdp.dependencies.version}</hbase.version>
                <phoenix.version>5.0.0.3.0.2.0-50</phoenix.version>
                <tephra.version>0.7.0</tephra.version>

However, at least I have a nice feedback loop now to test different pom settings to see which will include the files needed.

I think this is probably down to me misunderstanding how the POM works.
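
One thing the properties above make visible is that the versions come from different HDP lines: `hbase.version` resolves to `1.1.2.2.5.6.13-3`, while `hadoop.version` and `phoenix.version` end in `3.0.2.0-50`. The sketch below splits such a version string, assuming the Hortonworks convention that the first three dot-separated components are the upstream Apache version and the remainder identifies the HDP build. The function names are illustrative.

```shell
# Upstream Apache version: the first three dot-separated components.
upstream_version() {
  printf '%s\n' "$1" | cut -d. -f1-3
}

# HDP build suffix: everything after the third component (assumed convention).
hdp_build() {
  printf '%s\n' "$1" | cut -d. -f4-
}

# Succeed only if two version strings carry the same HDP build suffix.
same_hdp_build() {
  [ "$(hdp_build "$1")" = "$(hdp_build "$2")" ]
}
```

With the values from the pom, `same_hdp_build 3.1.1.3.0.2.0-50 1.1.2.2.5.6.13-3` fails, which is a quick way to spot that Hadoop and HBase are being pulled from different distributions.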

marrobi commented 5 years ago

Where is ${hadoop.version} defined for the other dependencies?

I see it's hard coded here: https://github.com/opencb/opencga/blob/d39ec730520f6ba9bf153e078221fe34a6bd2ac6/opencga-storage/opencga-storage-hadoop/opencga-storage-hadoop-deps/pom.xml#L104 , but not https://github.com/opencb/opencga/blob/d39ec730520f6ba9bf153e078221fe34a6bd2ac6/opencga-storage/opencga-storage-hadoop/opencga-storage-hadoop-deps/pom.xml#L92 .

marrobi commented 5 years ago

So from http://central.maven.org/maven2/org/apache/hadoop/:

➜  tmp jar -tf hadoop-azure-3.1.1.jar | grep AzureBlob
NOTHING
➜  tmp jar -tf hadoop-azure-3.2.0.jar | grep AzureBlob
org/apache/hadoop/fs/azurebfs/contracts/exceptions/AzureBlobFileSystemException.class
org/apache/hadoop/fs/azurebfs/SecureAzureBlobFileSystem.class
org/apache/hadoop/fs/azurebfs/AzureBlobFileSystem$1.class
org/apache/hadoop/fs/azurebfs/AzureBlobFileSystem$2.class
org/apache/hadoop/fs/azurebfs/AzureBlobFileSystem$FileSystemOperation.class
org/apache/hadoop/fs/azurebfs/AzureBlobFileSystem.class
org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore$VersionedFileStatus.class
org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore.class
➜  tmp jar -tf hadoop-common-3.2.0.jar | grep StreamCapabi
org/apache/hadoop/fs/StreamCapabilities$StreamCapability.class
org/apache/hadoop/fs/StreamCapabilities.class
org/apache/hadoop/fs/StreamCapabilitiesPolicy.class
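
Since both `AzureBlobFileSystem` and `StreamCapabilities` have to be present at the same time, it can help to check a whole set of jars against a list of required classes in one go. A sketch, assuming the jar listings are concatenated on stdin; `missing_classes` is a hypothetical helper name.

```shell
# Read combined jar listings on stdin and print every class named in the
# arguments that has no exact matching entry. The body runs in a subshell
# (parentheses) so its variables do not leak into the caller.
missing_classes() (
  listing="$(cat)"
  for class_name in "$@"; do
    entry="$(printf '%s' "$class_name" | tr '.' '/').class"
    printf '%s\n' "$listing" | grep -qxF -- "$entry" \
      || printf '%s\n' "$class_name"
  done
)
```

For example, `{ jar -tf hadoop-azure-3.2.0.jar; jar -tf hadoop-common-3.2.0.jar; } | missing_classes org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem org.apache.hadoop.fs.StreamCapabilities` should print nothing if both classes are available.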

lawrencegripper commented 5 years ago

So on the cluster I've found a version of hadoop-azure.jar here: /usr/hdp/current/hadoop-client

This contains the AzureBlobFileSystem class (hadoop-azure-2.7.3.2.6.5.3005-27.jar):

sshuser@hn0-cgahba:/usr/hdp/current/hadoop-client$ jar -tf hadoop-azure-2.7.3.2.6.5.3005-27.jar | grep AzureBlob
org/apache/hadoop/fs/azurebfs/AzureBlobFileSystem$2.class
org/apache/hadoop/fs/azurebfs/AzureBlobFileSystem.class
org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore$VersionedFileStatus.class
org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore.class
org/apache/hadoop/fs/azurebfs/AzureBlobFileSystem$1.class
org/apache/hadoop/fs/azurebfs/AzureBlobFileSystem$FileSystemOperation.class
org/apache/hadoop/fs/azurebfs/SecureAzureBlobFileSystem.class
org/apache/hadoop/fs/azurebfs/contracts/exceptions/AzureBlobFileSystemException.class

lawrencegripper commented 5 years ago

To try and force this to be used I'm then running sudo docker run -i -t -p 8080:8080 -p 8443:8443 -e BASEDIR=/opt/opencga -e CLASSPATH=:/op/opt/opencga/conf/hadoop/hadoop-azure.jar --mount type=bind,src=/media/primarynfs/conf,dst=/opt/opencga/conf,readonly --mount type=bind,src=/media/primarynfs/sessions,dst=/opt/opencga/sessions opencb/opencga-app:7ee641fb4fb4c594c87d2578720e4d7d15d53215

This is after copying the jars from the cluster with scp, like so:

sshpass -p 'passwordhere' scp -o StrictHostKeyChecking=no  -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -r "sshuser@nodehere":'/usr/hdp/current/hadoop-client/*.jar' /media/primarynfs/conf/hadoop/

I'm unsure which of these is picked up as the opencga-env.sh file sets the following:

LANG=C.UTF-8
BASEDIR=/opt/opencga
HOSTNAME=8ea9d06606a7
CLASSPATH_PREFIX=:/opt/opencga/conf/hadoop/
JAVA_OPTS= -Dlog4j.configuration=file:/opt/opencga/conf/log4j.properties
JAVA_HOME=/usr/lib/jvm/java-1.8-openjdk/jre
JAVA_VERSION=8u191
PWD=/opt/opencga
HADOOP_CLASSPATH=:/opt/opencga/libs/protobuf-java-util-3.5.1.jar:/opt/opencga/libs/protobuf-java-3.5.1.jar::/opt/opencga/libs/avro-mapred-1.7.7-hadoop2.jar:/opt/opencga/libs/avro-1.7.7.jar:/opt/opencga/libs/avro-ipc-1.7.7-tests.jar:/opt/opencga/libs/avro-ipc-1.7.7.jar::/opt/opencga/libs/jackson-jaxrs-json-provider-2.9.7.jar:/opt/opencga/libs/jackson-dataformat-cbor-2.6.3.jar:/opt/opencga/libs/jackson-dataformat-yaml-2.9.7.jar:/opt/opencga/libs/jackson-core-2.9.7.jar:/opt/opencga/libs/jackson-module-jaxb-annotations-2.9.7.jar:/opt/opencga/libs/jackson-datatype-joda-2.9.4.jar:/opt/opencga/libs/jackson-databind-2.9.7.jar:/opt/opencga/libs/jackson-dataformat-xml-2.9.7.jar:/opt/opencga/libs/jackson-annotations-2.9.7.jar:/opt/opencga/libs/jackson-jaxrs-base-2.9.7.jar:::/opt/opencga/conf/hadoop/
HOME=/home/opencga
HADOOP_USER_CLASSPATH_FIRST=true
TERM=xterm
SHLVL=1
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/lib/jvm/java-1.8-openjdk/jre/bin:/usr/lib/jvm/java-1.8-openjdk/bin
JAVA_ALPINE_VERSION=8.191.12-r0
_=/bin/printenv

jjcollinge commented 5 years ago

Another thing to try would be to modify the built jar and drop hadoop-azure.jar in at the correct path inside the opencga.jar (org/apache/hadoop/fs/). A jar is just a .zip, so rename it, inflate it, extract hadoop-azure.jar into that path, and then repack it all. I think this way we wouldn't need to do any shading, and it might not conflict with the opencga-env.sh script that sets the CLASSPATH.

EDIT: it looks like CLASSPATH_PREFIX can be used to append deps.
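
When reasoning about which jar is picked up, it helps to look at the classpath one entry at a time. On a standard JVM classpath, a plain directory entry exposes the `.class` files and resources inside it but not the jars it contains (those need a `dir/*` wildcard or explicit jar paths), and an empty entry is generally treated as the current directory. Whether the Tomcat web application class loader honours `CLASSPATH_PREFIX` at all is a separate question that this sketch does not answer. The function name `describe_classpath` is illustrative.

```shell
# Split a colon-separated classpath and label each entry, one per line.
# Runs in a subshell so the IFS and globbing changes stay local.
describe_classpath() (
  set -f      # disable globbing while the string is being split
  IFS=':'
  for entry in $1; do
    case "$entry" in
      '')    printf 'empty\t(current directory)\n' ;;
      *.jar) printf 'jar\t%s\n' "$entry" ;;
      */\*)  printf 'wildcard\t%s\n' "$entry" ;;
      *)     printf 'directory\t%s\n' "$entry" ;;  # anything else is assumed to be a directory
    esac
  done
)
```

Running `describe_classpath "$CLASSPATH_PREFIX"` with the value `:/opt/opencga/conf/hadoop/` from the environment above would show one empty entry and one directory entry, and no jar entries.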

lawrencegripper commented 5 years ago

By having hadoop-azure.jar in the /opt/opencga/conf/hadoop folder and running:

sudo docker run -i -t -p 8080:8080 -p 8443:8443 -e BASEDIR=/opt/opencga --mount type=bind,src=/media/primarynfs/conf,dst=/opt/opencga/conf,readonly    --mount type=bind,src=/media/primarynfs/sessions,dst=/opt/opencga/sessions lawrencegripppencga-app:0.3 "/opt/tomcat/bin/catalina.sh run"

Note: this is without the opencga-env.sh running

I now get the stream error:

java.lang.IllegalStateException: Problems opening connection to DB
    at org.opencb.opencga.storage.hadoop.utils.HBaseManager.getConnection(HBaseManager.java:131)
    at org.opencb.opencga.storage.hadoop.utils.HBaseManager.<init>(HBaseManager.java:79)
    at org.opencb.opencga.storage.hadoop.variant.metadata.AbstractHBaseDBAdaptor.<init>(AbstractHBaseDBAdaptor.java:48)
    at org.opencb.opencga.storage.hadoop.variant.metadata.HBaseProjectMetadataDBAdaptor.<init>(HBaseProjectMetadataDBAdaptor.java:46)
    at org.opencb.opencga.storage.hadoop.variant.metadata.HBaseVariantStorageMetadataDBAdaptorFactory.buildProjectMetadataDBAdaptor(HBaseVariantStorageMetadataDBAdaptorFactory.java:38)
    at org.opencb.opencga.storage.hadoop.variant.metadata.HBaseVariantStorageMetadataDBAdaptorFactory.buildProjectMetadataDBAdaptor(HBaseVariantStorageMetadataDBAdaptorFactory.java:13)
    at org.opencb.opencga.storage.core.metadata.StudyConfigurationManager.<init>(StudyConfigurationManager.java:76)
    at org.opencb.opencga.storage.hadoop.variant.HadoopVariantStorageEngine.getStudyConfigurationManager(HadoopVariantStorageEngine.java:923)
    at org.opencb.opencga.storage.core.manager.variant.VariantStorageManager.secure(VariantStorageManager.java:512)
    at org.opencb.opencga.storage.core.manager.variant.VariantStorageManager.get(VariantStorageManager.java:352)
    at org.opencb.opencga.server.rest.analysis.VariantAnalysisWSService.getVariants(VariantAnalysisWSService.java:350)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161)
    at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:160)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99)
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389)
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347)
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102)
    at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326)
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
    at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317)
    at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305)
    at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154)
    at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473)
    at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
    at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
    at org.opencb.opencga.server.CORSFilter.doFilter(CORSFilter.java:46)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:198)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:493)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:140)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:81)
    at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:650)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:342)
    at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:800)
    at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
    at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:806)
    at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1498)
    at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:218)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:119)
    at org.opencb.opencga.storage.hadoop.utils.HBaseManager.getConnection(HBaseManager.java:125)
    ... 61 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
    ... 64 more
Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/fs/StreamCapabilities
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at org.apache.catalina.loader.WebappClassLoaderBase.findClassInternal(WebappClassLoaderBase.java:2356)
    at org.apache.catalina.loader.WebappClassLoaderBase.findClass(WebappClassLoaderBase.java:830)
    at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1297)
    at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1156)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.initialize(AzureBlobFileSystem.java:103)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2795)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2829)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2811)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:390)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:179)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:374)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
    at org.apache.hadoop.hbase.util.DynamicClassLoader.initTempDir(DynamicClassLoader.java:118)
    at org.apache.hadoop.hbase.util.DynamicClassLoader.<init>(DynamicClassLoader.java:98)
    at org.apache.hadoop.hbase.protobuf.ProtobufUtil.<clinit>(ProtobufUtil.java:250)
    at org.apache.hadoop.hbase.ClusterId.parseFrom(ClusterId.java:64)
    at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:75)
    at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:105)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:889)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:645)
    ... 69 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.StreamCapabilities
    at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1328)
    at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1156)
    ... 93 more

This suggests that WITHOUT the opencga-env.sh script running, and with the azure jar copied into conf/hadoop, things behave up until we hit the StreamCapabilities error.

jjcollinge commented 5 years ago

OK, I think I got this working, or at least running. I still need to check the Fitnesse tests.

I used the following command to run the container:

sudo docker run -i -t -p 8080:8080 -p 8443:8443 -e BASEDIR=/opt/opencga -e CLASSPATH_PREFIX=:/opt/opencga/conf/hadoop/hadoop-azure.jar:/opt/opencga/conf/hadoop/hadoop-common.jar:/opt/opencga/conf/hadoop/hadoop-azure-datalake.jar --mount type=bind,src=/media/primarynfs/conf,dst=/opt/opencga/conf,readonly --mount type=bind,src=/media/primarynfs/sessions,dst=/opt/opencga/sessions opencb/opencga-app:7ee641fb4fb4c594c87d2578720e4d7d15d53215

I don't think we need the datalake jar but I threw it in for good measure - so we can likely remove it.

This then led to this error: java.lang.ClassNotFoundException: com.microsoft.log4jappender.EtwAppender

I initially went and got that .jar too and loaded it, but there were more errors related to ETW. Then I realised we didn't even need those log appenders, so I edited the log4j.properties in the Hadoop conf we grab from the server to remove any ETW/Filter appender references (diff below).
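
The kind of edit described above can be approximated with `sed`. This is a sketch and not the diff referred to: it assumes log4j 1.x property syntax, where an appender named `NAME` is defined by keys starting with `log4j.appender.NAME` and is referenced from comma-separated logger lists. The actual appender names in the cluster's file need to be checked before using it, and `strip_appender` is a made-up helper name.

```shell
# Remove one log4j 1.x appender from a properties file read on stdin.
# $1 is the appender name (assumed to contain no regex metacharacters).
strip_appender() {
  sed \
    -e "/^[[:space:]]*log4j\.appender\.${1}[[:space:].=]/d" \
    -e "s/,[[:space:]]*${1}[[:space:]]*,/,/" \
    -e "s/,[[:space:]]*${1}[[:space:]]*\$//"
}
```

The first expression deletes the appender's own definition lines. The other two remove the name from logger lists such as `log4j.rootLogger=INFO,console,NAME`, so the logger line itself survives. It would be applied once per appender, for example `strip_appender ETW < log4j.properties > log4j.properties.new`, where `ETW` stands in for whatever name the file really uses.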

After doing this I started the server; it set up correctly and served the usual pages on http://localhost:8008/opencga.

diff

lawrencegripper commented 5 years ago

Great work, I’ll test on my environment shortly.

Assuming all works I guess the next steps would be to update ‘opencga-env.sh’ to load these jars into the CLASSPATH and update the init container to pull down the jars and edit the log4j config.

Sound about right?
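
For the opencga-env.sh step, the `CLASSPATH_PREFIX` value used in the working `docker run` command could be assembled from a directory and a list of jar names rather than typed by hand. A sketch, keeping the leading colon seen in the commands in this thread; `build_classpath_prefix` is a hypothetical helper, and how opencga-env.sh should consume the result is an assumption.

```shell
# Build a CLASSPATH_PREFIX-style value: each jar name is joined to the
# directory and prefixed with a colon. A trailing slash on the directory
# is dropped so the result has no double slashes.
build_classpath_prefix() (
  dir="${1%/}"
  shift
  result=""
  for jar in "$@"; do
    result="${result}:${dir}/${jar}"
  done
  printf '%s\n' "$result"
)
```

For example, `CLASSPATH_PREFIX="$(build_classpath_prefix /opt/opencga/conf/hadoop hadoop-azure.jar hadoop-common.jar)"` yields `:/opt/opencga/conf/hadoop/hadoop-azure.jar:/opt/opencga/conf/hadoop/hadoop-common.jar`.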

jjcollinge commented 5 years ago

yup that's the plan! Will catch up with you after the standup to run through this

jjcollinge commented 5 years ago

At least for startup, removing hadoop-azure-datalake doesn't appear to make a difference, although we'd need to validate that with the Fitnesse tests. I've also confirmed that you do need to remove both the ETW and FilterLog appenders from the log4j properties.

Some references below:

NOTE: All jars available at /usr/hdp/2.6.5.3005-27/oozie/oozie-server/webapps/oozie/WEB-INF/lib
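The log4j edit could be scripted in the init container along these lines. This is a rough sketch, not the actual change: it assumes the appenders are declared under the names `ETW` and `FilterLog` (as `log4j.appender.<NAME>...` keys) and referenced by name in comma-separated logger lists. Check the names against the log4j.properties pulled from the cluster. `strip_appenders` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch only: remove two appenders from a log4j.properties stream.
# Assumed appender names: ETW and FilterLog (verify against the real file).
strip_appenders() {
  # Reads properties on stdin, writes the filtered properties to stdout.
  sed -E \
    -e '/^[[:space:]]*log4j\.appender\.(ETW|FilterLog)[.=[:space:]]/d' \
    -e 's/,[[:space:]]*ETW[[:space:]]*(,|$)/\1/g' \
    -e 's/,[[:space:]]*FilterLog[[:space:]]*(,|$)/\1/g'
}

# Possible use (paths are assumptions):
# strip_appenders < conf/hadoop/log4j.properties > conf/hadoop/log4j.properties.new
```

The first expression drops the appender definitions and their sub-properties; the other two remove the names from logger lists such as `log4j.rootLogger`, one name at a time so adjacent entries are both caught.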

lawrencegripper commented 5 years ago

So running Tomcat with the following still produced the error for Azure Blob:

sudo docker run -i -t -p 8080:8080 -p 8443:8443 -e BASEDIR=/opt/opencga -e CLASSPATH_PREFIX=:/opt/opencga/conf/hadoop/hadoop-azure.jar:/opt/opencga/conf/hadoop/hadoop-common.jar:/opt/opencga/conf/hadoop/hadoop-azure-datalake.jar --mount type=bind,src=/media/primarynfs/conf,dst=/opt/opencga/conf,readonly --mount type=bind,src=/media/primarynfs/sessions,dst=/opt/opencga/sessions opencb/opencga-app:7ee641fb4fb4c594c87d2578720e4d7d15d53215

This results in the env being set up like this:

opencgaadmin@webserversl7n2ed000003:~$ sudo docker run -i -t -p 8080:8080 -p 8443:8443 -e BASEDIR=/opt/opencga -e CLASSPATH_PREFIX=:/opt/opencga/conf/hadoop/hadoop-azure.jar:/opt/opencga/conf/hadoop/hadoop-common.jar:/opt/opencga/conf/hadoop/hadoop-azure-datalake.jar:/opt/opencga/conf/hadoop/ --mount type=bind,src=/media/primarynfs/conf,dst=/opt/opencga/conf,readonly --mount type=bind,src=/media/primarynfs/sessions,dst=/opt/opencga/sessions opencb/opencga-app:7ee641fb4fb4c594c87d2578720e4d7d15d53215 ". /opt/opencga/conf/opencga-env.sh && printenv"
LANG=C.UTF-8
BASEDIR=/opt/opencga
HOSTNAME=f2dd6b3c322a
CLASSPATH_PREFIX=:/opt/opencga/conf/hadoop/hadoop-azure.jar:/opt/opencga/conf/hadoop/hadoop-common.jar:/opt/opencga/conf/hadoop/hadoop-azure-datalake.jar:/opt/opencga/conf/hadoop/:/opt/opencga/conf/hadoop/
JAVA_OPTS= -Dlog4j.configuration=file:/opt/opencga/conf/log4j.properties
JAVA_HOME=/usr/lib/jvm/java-1.8-openjdk/jre
JAVA_VERSION=8u191
PWD=/opt/opencga
HADOOP_CLASSPATH=:/opt/opencga/libs/protobuf-java-util-3.5.1.jar:/opt/opencga/libs/protobuf-java-3.5.1.jar::/opt/opencga/libs/avro-mapred-1.7.7-hadoop2.jar:/opt/opencga/libs/avro-1.7.7.jar:/opt/opencga/libs/avro-ipc-1.7.7-tests.jar:/opt/opencga/libs/avro-ipc-1.7.7.jar::/opt/opencga/libs/jackson-jaxrs-json-provider-2.9.7.jar:/opt/opencga/libs/jackson-dataformat-cbor-2.6.3.jar:/opt/opencga/libs/jackson-dataformat-yaml-2.9.7.jar:/opt/opencga/libs/jackson-core-2.9.7.jar:/opt/opencga/libs/jackson-module-jaxb-annotations-2.9.7.jar:/opt/opencga/libs/jackson-datatype-joda-2.9.4.jar:/opt/opencga/libs/jackson-databind-2.9.7.jar:/opt/opencga/libs/jackson-dataformat-xml-2.9.7.jar:/opt/opencga/libs/jackson-annotations-2.9.7.jar:/opt/opencga/libs/jackson-jaxrs-base-2.9.7.jar:::/opt/opencga/conf/hadoop/hadoop-azure.jar:/opt/opencga/conf/hadoop/hadoop-common.jar:/opt/opencga/conf/hadoop/hadoop-azure-datalake.jar:/opt/opencga/conf/hadoop/:/opt/opencga/conf/hadoop/
HOME=/home/opencga
HADOOP_USER_CLASSPATH_FIRST=true
TERM=xterm
SHLVL=1
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/lib/jvm/java-1.8-openjdk/jre/bin:/usr/lib/jvm/java-1.8-openjdk/bin
JAVA_ALPINE_VERSION=8.191.12-r0
_=/bin/printenv

And the same error is produced.

lawrencegripper commented 5 years ago

So I have a new error now. Having read about how Tomcat finds classes (blog post here), I realized that setting the classpath won't have the needed effect.

Given that, I then edited catalina.properties (/opt/tomcat/conf/catalina.properties) and added:

common.loader="${catalina.base}/lib","${catalina.base}/lib/*.jar","${catalina.home}/lib","${catalina.home}/lib/*.jar","/opt/opencga/conf/hadoop/*.jar"
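If this ends up in the image build rather than a manual edit, the change could be applied with something like the sketch below. It assumes `common.loader` sits on a single line in catalina.properties (no backslash continuations); `add_common_loader_entry` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch only: append one entry to the common.loader line of catalina.properties.
# Assumes common.loader is a single line; the helper name is hypothetical.
add_common_loader_entry() {
  # $1 = entry to append, including its double quotes.
  # Reads the properties on stdin and writes the result to stdout.
  local entry="$1"
  awk -v entry="$entry" '
    # Only touch the common.loader line, and only if the entry is not there yet.
    /^common\.loader=/ && index($0, entry) == 0 { $0 = $0 "," entry }
    { print }
  '
}

# Possible use (file location taken from the comment above):
# add_common_loader_entry '"/opt/opencga/conf/hadoop/*.jar"' \
#   < /opt/tomcat/conf/catalina.properties > /tmp/catalina.properties
```

The `index` check keeps the edit idempotent, so re-running the init step does not add the entry twice.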

The new error I get is shown below. It suggests that I've successfully forced loading of the Azure jars from the cluster, which are copied into /opt/opencga/conf/hadoop, but that there is a version mismatch somewhere.

[Edited copy paste error]

Caused by: java.lang.ClassCastException: org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem cannot be cast to org.apache.hadoop.fs.FileSystem
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2794)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2829)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2811)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:390)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:179)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:374)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
    at org.apache.hadoop.hbase.util.DynamicClassLoader.initTempDir(DynamicClassLoader.java:118)
    at org.apache.hadoop.hbase.util.DynamicClassLoader.<init>(DynamicClassLoader.java:98)
    at org.apache.hadoop.hbase.protobuf.ProtobufUtil.<clinit>(ProtobufUtil.java:250)
    ... 74 more

jjcollinge commented 5 years ago

Here's an attempt that didn't work.

Extract the opencga.war, extract WEB-INF/lib/opencga-storage-hadoop-deps-1.4.0-rc3-dev-hdp-2.6.0-shaded.jar, add azurebfs from the HDI .jar under the path org/apache/hadoop/fs/azurebfs, repackage the .jar and the .war, then docker cp the .war back into /opt/opencga of the opencga-app container, overwriting the existing one. Then start Tomcat with . /opt/opencga/conf/opencga-env.sh && /opt/tomcat/bin/catalina.sh run. I've also tried putting the .jar directly in the .war's WEB-INF/lib directory and in the /opt/opencga/libs directory, with no luck.

lawrencegripper commented 5 years ago

Retried building without the -pStorage-hadoop-deps, in an attempt to remove any existing versions of the Hadoop jars from the build and ensure we use the ones copied from the cluster.

This failed again with java.lang.ClassCastException: org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem cannot be cast to org.apache.hadoop.fs.FileSystem. Oddly, looking at the build output, it still seemed to produce the shaded storage deps jar, so the mvn command might not have worked as expected. Will retry Monday.

mvn clean install -DskipTests -Dstorage-hadoop -Phdp-2.6.0 -DOPENCGA.STORAGE.DEFAULT_ENGINE=hadoop -Dopencga.war.name=opencga -Dcheckstyle.skip

PS. Destroying my environment to reduce costs over the weekend, so I will be starting fresh.

marrobi commented 5 years ago

So I added the jar files to the local Maven repository:

mvn install:install-file -Dfile=./azure-deps/hadoop-azure-2.7.3.2.6.5.3005-27.jar -DgroupId=org.apache.hadoop -DartifactId=hadoop-azure -Dversion=2.7.3.2.6.5.3005-27 -Dpackaging=jar
mvn install:install-file -Dfile=./azure-deps/hadoop-common-2.7.3.2.6.5.3005-27.jar -DgroupId=org.apache.hadoop -DartifactId=hadoop-common -Dversion=2.7.3.2.6.5.3005-27 -Dpackaging=jar
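If more jars from the cluster need installing, the two commands above could be generated from the file names. A small sketch, assuming every file is named `<artifactId>-<version>.jar` and that the version starts at the first `-` followed by a digit; `install_cmd_for_jar` is a hypothetical helper that only prints the command, it does not run Maven.

```shell
#!/usr/bin/env bash
# Sketch only: print the mvn install:install-file command for one jar.
# Assumes the file is named <artifactId>-<version>.jar and that the version
# begins at the first "-<digit>" in the name.
install_cmd_for_jar() {
  local file="$1" group="$2" base artifact version
  base="$(basename "$file" .jar)"
  # Everything before the first "-<digit>" is treated as the artifactId.
  artifact="$(printf '%s\n' "$base" | sed -E 's/-[0-9].*$//')"
  version="${base#"$artifact"-}"
  printf 'mvn install:install-file -Dfile=%s -DgroupId=%s -DartifactId=%s -Dversion=%s -Dpackaging=jar\n' \
    "$file" "$group" "$artifact" "$version"
}

# Possible use:
# for jar in ./azure-deps/*.jar; do install_cmd_for_jar "$jar" org.apache.hadoop; done
```

Printing the commands first makes it easy to check the derived artifactId and version before anything is written to the local repository.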

Updated the pom to reference the specific versions:

        <!--Hadoop dependencies-->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.7.3.2.6.5.3005-27</version>
            <optional>true</optional>
        </dependency>

        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>${hadoop.version}</version>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-azure</artifactId>
            <version>2.7.3.2.6.5.3005-27</version>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-azure-datalake</artifactId>
            <version>${hadoop.version}</version>
            <optional>true</optional>
        </dependency>

FitNesse Query Variant gives:

{"apiVersion":"v1","time":2086,"warning":"","error":"Study { name: \"qwfrd@bpuzh:szwwd\" } not found.","queryOptions":{"metadata":true,"skipCount":true,"limit":2000},"response":[{"id":"","dbTime":-1,"numResults":-1,"numTotalResults":-1,"warningMsg":"Future errors will ONLY be shown in the QueryResponse body","errorMsg":"DEPRECATED: org.opencb.opencga.storage.core.variant.adaptors.VariantQueryException: Study { name: \"qwfrd@bpuzh:szwwd\" } not found.","resultType":"","result":[]}]}

That's right, isn't it?!

Output in Docker logs:

2019-01-25 17:25:15 [http-nio-8080-exec-9-SendThread(10.0.0.8:2181)] INFO  ClientCnxn:1235 - Session establishment complete on server 10.0.0.8/10.0.0.8:2181, sessionid = 0x368847f2d6000bf, negotiated timeout = 120000
2019-01-25 17:25:16 [http-nio-8080-exec-9] WARN  Configuration:2825 - hbase-site.xml:an attempt to override final parameter: dfs.support.append;  Ignoring.
2019-01-25 17:25:16 [http-nio-8080-exec-9] WARN  DynamicClassLoader:120 - Failed to identify the fs of dir /hbase/lib, ignored
Cannot run program "/usr/lib/hdinsight-common/scripts/decrypt.sh": error=2, No such file or directory
        at org.apache.hadoop.fs.azurebfs.services.ShellDecryptionKeyProvider.getStorageAccountKey(ShellDecryptionKeyProvider.java:65)
        at org.apache.hadoop.fs.azurebfs.AbfsConfiguration.getStorageAccountKey(AbfsConfiguration.java:355)
        at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.initializeClient(AzureBlobFileSystemStore.java:880)
        at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.<init>(AzureBlobFileSystemStore.java:181)
        at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.initialize(AzureBlobFileSystem.java:108)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2796)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2830)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2812)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:390)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:179)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:374)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
        at org.apache.hadoop.hbase.util.DynamicClassLoader.initTempDir(DynamicClassLoader.java:118)
        at org.apache.hadoop.hbase.util.DynamicClassLoader.<init>(DynamicClassLoader.java:98)
        at org.apache.hadoop.hbase.protobuf.ProtobufUtil.<clinit>(ProtobufUtil.java:250)
        at org.apache.hadoop.hbase.ClusterId.parseFrom(ClusterId.java:64)
        at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:75)
        at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:105)
        at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:889)
        at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:645)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
        at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:218)
        at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:119)
        at org.opencb.opencga.storage.hadoop.utils.HBaseManager.getConnection(HBaseManager.java:125)
        at org.opencb.opencga.storage.hadoop.utils.HBaseManager.<init>(HBaseManager.java:79)
        at org.opencb.opencga.storage.hadoop.variant.metadata.AbstractHBaseDBAdaptor.<init>(AbstractHBaseDBAdaptor.java:48)
        at org.opencb.opencga.storage.hadoop.variant.metadata.HBaseProjectMetadataDBAdaptor.<init>(HBaseProjectMetadataDBAdaptor.java:46)
        at org.opencb.opencga.storage.hadoop.variant.metadata.HBaseVariantStorageMetadataDBAdaptorFactory.buildProjectMetadataDBAdaptor(HBaseVariantStorageMetadataDBAdaptorFactory.java:38)
        at org.opencb.opencga.storage.hadoop.variant.metadata.HBaseVariantStorageMetadataDBAdaptorFactory.buildProjectMetadataDBAdaptor(HBaseVariantStorageMetadataDBAdaptorFactory.java:13)
        at org.opencb.opencga.storage.core.metadata.StudyConfigurationManager.<init>(StudyConfigurationManager.java:76)
        at org.opencb.opencga.storage.hadoop.variant.HadoopVariantStorageEngine.getStudyConfigurationManager(HadoopVariantStorageEngine.java:923)
        at org.opencb.opencga.storage.core.manager.variant.VariantStorageManager.secure(VariantStorageManager.java:512)
        at org.opencb.opencga.storage.core.manager.variant.VariantStorageManager.get(VariantStorageManager.java:352)
        at org.opencb.opencga.server.rest.analysis.VariantAnalysisWSService.getVariants(VariantAnalysisWSService.java:350)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81)
        at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144)
        at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161)
        at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:160)
        at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99)
        at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389)
        at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347)
        at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102)
        at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326)
        at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
        at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
        at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
        at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
        at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
        at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317)
        at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305)
        at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154)
        at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473)
        at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427)
        at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388)
        at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341)
        at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.opencb.opencga.server.CORSFilter.doFilter(CORSFilter.java:46)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:198)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:493)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:140)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:81)
        at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:650)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:342)
        at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:800)
        at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
        at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:806)
        at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1498)
        at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.lang.Thread.run(Thread.java:748)
2019-01-25 17:25:16 [http-nio-8080-exec-9] WARN  Configuration:2825 - hbase-site.xml:an attempt to override final parameter: dfs.support.append;  Ignoring.
2019-01-25 17:25:16 [http-nio-8080-exec-9] WARN  Configuration:2825 - hbase-site.xml:an attempt to override final parameter: dfs.support.append;  Ignoring.
2019-01-25 17:25:16 [http-nio-8080-exec-9] INFO  HBaseManager:129 - Opened Hadoop DB connection hconnection-0x280505f6 called from [java.lang.Thread.getStackTrace(Thread.java:1559), org.opencb.opencga.storage.hadoop.utils.HBaseManager.getConnection(HBaseManager.java:128), org.opencb.opencga.storage.hadoop.utils.HBaseManager.<init>(HBaseManager.java:79), org.opencb.opencga.storage.hadoop.variant.metadata.AbstractHBaseDBAdaptor.<init>(AbstractHBaseDBAdaptor.java:48), org.opencb.opencga.storage.hadoop.variant.metadata.HBaseProjectMetadataDBAdaptor.<init>(HBaseProjectMetadataDBAdaptor.java:46), org.opencb.opencga.storage.hadoop.variant.metadata.HBaseVariantStorageMetadataDBAdaptorFactory.buildProjectMetadataDBAdaptor(HBaseVariantStorageMetadataDBAdaptorFactory.java:38), org.opencb.opencga.storage.hadoop.variant.metadata.HBaseVariantStorageMetadataDBAdaptorFactory.buildProjectMetadataDBAdaptor(HBaseVariantStorageMetadataDBAdaptorFactory.java:13), org.opencb.opencga.storage.core.metadata.StudyConfigurationManager.<init>(StudyConfigurationManager.java:76), org.opencb.opencga.storage.hadoop.variant.HadoopVariantStorageEngine.getStudyConfigurationManager(HadoopVariantStorageEngine.java:923), org.opencb.opencga.storage.core.manager.variant.VariantStorageManager.secure(VariantStorageManager.java:512), org.opencb.opencga.storage.core.manager.variant.VariantStorageManager.get(VariantStorageManager.java:352), org.opencb.opencga.server.rest.analysis.VariantAnalysisWSService.getVariants(VariantAnalysisWSService.java:350), sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method), sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62), sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43), java.lang.reflect.Method.invoke(Method.java:498), org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81), 
org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144), org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161), org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:160), org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99), org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389), org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347), org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102), org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326), org.glassfish.jersey.internal.Errors$1.call(Errors.java:271), org.glassfish.jersey.internal.Errors$1.call(Errors.java:267), org.glassfish.jersey.internal.Errors.process(Errors.java:315), org.glassfish.jersey.internal.Errors.process(Errors.java:297), org.glassfish.jersey.internal.Errors.process(Errors.java:267), org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317), org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305), org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154), org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473), org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427), org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388), org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341), org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228), org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231), 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166), org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52), org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193), org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166), org.opencb.opencga.server.CORSFilter.doFilter(CORSFilter.java:46), org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193), org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166), org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:198), org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96), org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:493), org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:140), org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:81), org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:650), org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87), org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:342), org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:800), org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66), org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:806),
org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1498), org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49), java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149), java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624), org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61), java.lang.Thread.run(Thread.java:748)]
2019-01-25 17:25:16 [http-nio-8080-exec-9] ERROR OpenCGAWSServer:447 - Catch error: Study { name: "qwfrd@bpuzh:szwwd" } not found.
org.opencb.opencga.storage.core.variant.adaptors.VariantQueryException: Study { name: "qwfrd@bpuzh:szwwd" } not found.
        at org.opencb.opencga.storage.core.variant.adaptors.VariantQueryException.studyNotFound(VariantQueryException.java:101)
        at org.opencb.opencga.storage.core.metadata.StudyConfigurationManager.getStudyId(StudyConfigurationManager.java:339)
        at org.opencb.opencga.storage.core.metadata.StudyConfigurationManager.getStudyIds(StudyConfigurationManager.java:299)
        at org.opencb.opencga.storage.core.metadata.StudyConfigurationManager.getStudyIds(StudyConfigurationManager.java:280)
        at org.opencb.opencga.storage.core.variant.adaptors.VariantQueryUtils.getIncludeStudies(VariantQueryUtils.java:608)
        at org.opencb.opencga.storage.core.variant.adaptors.VariantQueryUtils.getIncludeStudies(VariantQueryUtils.java:576)
        at org.opencb.opencga.storage.core.manager.variant.VariantStorageManager.checkSamplesPermissions(VariantStorageManager.java:558)
        at org.opencb.opencga.storage.core.manager.variant.VariantStorageManager.secure(VariantStorageManager.java:512)
        at org.opencb.opencga.storage.core.manager.variant.VariantStorageManager.get(VariantStorageManager.java:352)
        at org.opencb.opencga.server.rest.analysis.VariantAnalysisWSService.getVariants(VariantAnalysisWSService.java:350)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81)
        at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144)
        at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161)
        at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:160)
        at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99)
        at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389)
        at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347)
        at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102)
        at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326)
        at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
        at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
        at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
        at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
        at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
        at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317)
        at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305)
        at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154)
        at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473)
        at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427)
        at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388)
        at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341)
        at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.opencb.opencga.server.CORSFilter.doFilter(CORSFilter.java:46)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:198)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:493)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:140)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:81)
        at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:650)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:342)
        at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:800)
        at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
        at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:806)
        at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1498)
        at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.lang.Thread.run(Thread.java:748)

Changes and jar files are here: https://github.com/marrobi/opencga/tree/hack/hbase-connectivity

lawrencegripper commented 5 years ago

I think this line suggests something isn't right :(

Cannot run program "/usr/lib/hdinsight-common/scripts/decrypt.sh": error=2, No such file or directory

It looks like the jar expects some HDInsight scripts to be available on the host.

j-coll commented 5 years ago

As discussed, the best way to proceed should be to take the version HDP-2.6.5.3007-2 from the Hortonworks repositories.

https://github.com/hortonworks/hadoop-release/tree/HDP-2.6.5.3007-2-tag
http://repo.hortonworks.com/content/repositories/releases/org/apache/hadoop/hadoop-common/2.7.3.2.6.5.3007-2/
http://repo.hortonworks.com/content/repositories/releases/org/apache/hadoop/hadoop-azure/2.7.3.2.6.5.3007-2/