astronomy-commons / axs

Astronomy eXtensions for Spark: Fast, Scalable Analytics of Billion+ Row Catalogs
https://axs.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Cannot use local Derby database for Spark > 2.4.0 #19

Open · stevenstetzler opened this issue 4 years ago

stevenstetzler commented 4 years ago

@ctslater and I have been investigating an issue where Spark appears to make more than one connection to the Hive metastore when it is backed by a local (embedded) Derby database. Embedded Derby allows only a single connection at a time, so a crash occurs when a second connection is attempted. On epyc we use a shared MySQL metastore, which is why we were blind to this issue across version changes; the bug originally appeared on our AWS JupyterHub, where each user has a local Derby database instead of a shared one.
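For anyone triaging a similar crash, one quick check is to confirm which metastore the live session actually picked up. This is a hedged sketch, assuming the standard pyspark shell where spark is the active SparkSession and hive-site.xml has been loaded into the Hadoop configuration:

# Confirm Hive support is enabled and inspect the JDO connection URL that
# was loaded from hive-site.xml; an embedded setup shows a jdbc:derby:... URL.
print(spark.conf.get("spark.sql.catalogImplementation"))  # expect "hive"
hadoop_conf = spark.sparkContext._jsc.hadoopConfiguration()
print(hadoop_conf.get("javax.jdo.option.ConnectionURL"))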

The following is enough to reproduce the bug under Spark 3.0.0; the same code runs without error under Spark 2.4.0:

from axs import AxsCatalog, Constants

# `spark` is the SparkSession created by the pyspark shell
db = AxsCatalog(spark)
db.import_existing_table(
    "ztf",
    "/epyc/users/stevengs/ztf_oct19_small",
    num_buckets=500,
    zone_height=Constants.ONE_AMIN,
    import_into_spark=True
)

@ctslater to reproduce this on epyc, navigate to /epyc/users/stevengs/spark-testing and run

source 2.4.0/env.sh
pyspark

and paste in the code above, which should work. Then run

source 3.0.0/env.sh
pyspark

and paste in the code above again, which should fail with a traceback like:

Caused by: ERROR XJ040: Failed to start database '/epyc/users/stevengs/axs/metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@158bc877, see the next exception for details.
    at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
    at org.apache.derby.impl.jdbc.SQLExceptionFactory.wrapArgsForTransportAcrossDRDA(Unknown Source)
    ... 115 more
Caused by: ERROR XSDB6: Another instance of Derby may have already booted the database /data/epyc/users/stevengs/axs/metastore_db.
    at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
    at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
    at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.privGetJBMSLockOnDB(Unknown Source)
    at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.getJBMSLockOnDB(Unknown Source)
    at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.boot(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
    at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
    at org.apache.derby.impl.services.monitor.FileMonitor.startModule(Unknown Source)
    at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
    at org.apache.derby.impl.store.raw.RawStore$6.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.derby.impl.store.raw.RawStore.bootServiceModule(Unknown Source)
    at org.apache.derby.impl.store.raw.RawStore.boot(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
    at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
    at org.apache.derby.impl.services.monitor.FileMonitor.startModule(Unknown Source)
    at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
    at org.apache.derby.impl.store.access.RAMAccessManager$5.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.derby.impl.store.access.RAMAccessManager.bootServiceModule(Unknown Source)
    at org.apache.derby.impl.store.access.RAMAccessManager.boot(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
    at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
    at org.apache.derby.impl.services.monitor.FileMonitor.startModule(Unknown Source)
    at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
    at org.apache.derby.impl.db.BasicDatabase$5.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.derby.impl.db.BasicDatabase.bootServiceModule(Unknown Source)
    at org.apache.derby.impl.db.BasicDatabase.bootStore(Unknown Source)
    at org.apache.derby.impl.db.BasicDatabase.boot(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
    at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.startProviderService(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.findProviderAndStartService(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.startPersistentService(Unknown Source)
    at org.apache.derby.iapi.services.monitor.Monitor.startPersistentService(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedConnection$4.run(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedConnection$4.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.derby.impl.jdbc.EmbedConnection.startPersistentService(Unknown Source)
    ... 112 more

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 6, in <module>
  File "/epyc/opt/spark-axs-3.0.0-beta/python/axs/catalog.py", line 83, in import_existing_table
    self.spark.catalog.createTable(table_name, path, "parquet")
  File "/epyc/opt/spark-axs-3.0.0-beta/python/pyspark/sql/catalog.py", line 162, in createTable
    df = self._jcatalog.createTable(tableName, source, options)
  File "/epyc/opt/spark-axs-3.0.0-beta/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1286, in __call__
  File "/epyc/opt/spark-axs-3.0.0-beta/python/pyspark/sql/utils.py", line 102, in deco
    raise converted
pyspark.sql.utils.AnalysisException: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient;

The env.sh source files simply change SPARK_HOME, PATH, and SPARK_CONF_DIR to point to the right version of Spark on epyc and to load a hive-site.xml that specifies a local Derby database instead of the shared database running on epyc.
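As a hedged sketch of what such an env.sh might contain (the SPARK_HOME path below appears in the traceback, but the conf directory path is an assumption, not the actual epyc layout):

# Hypothetical env.sh for the 3.0.0 test environment; paths are illustrative.
export SPARK_HOME=/epyc/opt/spark-axs-3.0.0-beta
export SPARK_CONF_DIR=/epyc/users/stevengs/spark-testing/3.0.0/conf
export PATH="$SPARK_HOME/bin:$PATH"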

The following is the hive-site.xml for 2.4.0:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:/epyc/users/stevengs/spark-testing/2.4.0/metastore_db;create=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.EmbeddedDriver</value>
  </property>
</configuration>

and the following is the hive-site.xml for 3.0.0:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:/epyc/users/stevengs/spark-testing/3.0.0/metastore_db;create=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.EmbeddedDriver</value>
  </property>
</configuration>
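For contrast, a shared-metastore setup like the one on epyc would point the same two properties at a network database instead of an embedded one. This is only a sketch; the host, port, database name, and driver are placeholders, not the actual epyc configuration:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <!-- placeholder host and database name -->
    <value>jdbc:mysql://metastore-host:3306/hive_metastore</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
</configuration>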
ctslater commented 4 years ago

I seem to have gotten it to work by adding --conf spark.sql.hive.metastore.sharedPrefixes=org.apache.derby to the pyspark command line; want to check if that also works for you?
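Spelled out as a full invocation (assuming the hive-site.xml setup above is on SPARK_CONF_DIR):

# Keep org.apache.derby in the classloader shared between Spark and the
# isolated Hive metastore client, so the embedded Derby engine boots once
# instead of a second classloader attempting a competing boot (XSDB6).
pyspark --conf spark.sql.hive.metastore.sharedPrefixes=org.apache.derby

This matches the traceback, which shows the failure inside IsolatedClientLoader: without the shared prefix, the isolated Hive client classloader loads its own copy of the Derby engine, and the two copies fight over the same metastore_db directory.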

stevenstetzler commented 4 years ago

It looks like this fixes the issue for me on both epyc and the cloud system, thanks @ctslater. I guess this should be a default configuration in the spark-defaults.conf that we ship with AXS?
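If it were shipped as a default, the spark-defaults.conf entry would presumably look something like the line below. One caution, stated as an assumption about Spark's defaults: setting spark.sql.hive.metastore.sharedPrefixes replaces Spark's built-in default list (which covers common JDBC metastore drivers such as com.mysql.jdbc), so a shipped value may want to append Derby to that list rather than replace it:

# spark-defaults.conf sketch: Derby appended to the usual shared JDBC driver
# prefixes so both embedded-Derby and MySQL-backed metastores keep working.
spark.sql.hive.metastore.sharedPrefixes com.mysql.jdbc,org.postgresql,com.microsoft.sqlserver,oracle.jdbc,org.apache.derby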