VizierDB / vizier-scala

The Vizier kernel-free notebook programming environment

Running vizier with -w Gives a null pointer exception error #265

Closed shawnz99 closed 1 year ago

shawnz99 commented 1 year ago

Describe the bug Running vizier on a notebook with the -w option fails with a null pointer exception, e.g. mill vizier -w test_data/dependency_test/notebookTest ingest test_data/dependency_test/notebookTest/test.ipynb. The same error occurs when the -d argument is also given, e.g. mill vizier -d vizier.db -w test_data/dependency_test/notebookTest ingest test_data/dependency_test/notebookTest/test.ipynb. The error does not occur when the -w option is omitted.

To Reproduce Steps to reproduce the behavior:

  1. Go to the project root directory
  2. Run vizier using the -w option with a relative path
  3. See the error

Expected behavior Vizier should treat the path given with -w as the working directory.
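
As an illustration of the expected behavior, here is a minimal sketch of resolving a possibly relative -w argument to an absolute path before anything else uses it. It assumes the relative path is what trips up the Hadoop local filesystem setup seen in the trace below (the NullPointerException is raised in RawLocalFileSystem.getInitialWorkingDirectory); this thread does not confirm that. The class and method names are hypothetical and this is not the change made in the fix commit.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class WorkingDirResolver {
    // Hypothetical helper: turn a possibly relative -w argument into an
    // absolute, normalized path. Relative paths are resolved against the
    // directory the JVM was started in.
    static Path resolve(String workingDirArg) {
        return Paths.get(workingDirArg).toAbsolutePath().normalize();
    }

    public static void main(String[] args) {
        // Default to the relative path from this report when no argument is given.
        String arg = args.length > 0 ? args[0] : "test_data/dependency_test/notebookTest";
        Path workingDir = resolve(arg);
        // An absolute path can then be handed to code that expects one.
        System.out.println(workingDir);
    }
}
```

With a helper like this, a relative and an absolute -w argument would both end up as the same kind of value downstream, which is one way to make the two invocations behave alike.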

Screenshots

❯ ./mill vizier -d vizier.db -w  test_data/dependency_test/notebookTest ingest test_data/dependency_test/notebookTest/test.ipynb
[115/115] vizier.run 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/shawnz99/.cache/coursier/v1/https/repo1.maven.org/maven2/ch/qos/logback/logback-classic/1.2.10/logback-classic-1.2.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/shawnz99/.cache/coursier/v1/https/repo1.maven.org/maven2/org/apache/logging/log4j/log4j-slf4j-impl/2.17.2/log4j-slf4j-impl-2.17.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]
Checking for dependencies...

Your installed python is missing dependencies. Python cells may not work properly.

The following command will install required dependencies.
  python3 -m pip install 'pyarrow' 'shapely' 'pyspark=3.3.1'
Setting up project library...
test_data/dependency_test/notebookTest
Starting Spark...
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/home/shawnz99/.cache/coursier/v1/https/repo1.maven.org/maven2/org/apache/spark/spark-unsafe_2.12/3.3.1/spark-unsafe_2.12-3.3.1.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
16:10:45.707 [main] WARN  org.apache.spark.util.Utils - Your hostname, framework resolves to a loopback address: 127.0.1.1; using 192.168.1.18 instead (on interface wlp166s0)
16:10:45.714 [main] WARN  org.apache.spark.util.Utils - Set SPARK_LOCAL_IP if you need to bind to another address
16:10:46.338 [main] WARN  o.a.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16:10:50.005 [main] WARN  o.a.spark.sql.internal.SharedState - Cannot qualify the warehouse path, leaving it unqualified.
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:137)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3467)
        at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3574)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3521)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:288)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:524)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
        at org.apache.spark.sql.internal.SharedState$.qualifyWarehousePath(SharedState.scala:282)
        at org.apache.spark.sql.internal.SharedState.liftedTree1$1(SharedState.scala:80)
        at org.apache.spark.sql.internal.SharedState.<init>(SharedState.scala:79)
        at org.apache.spark.sql.SparkSession.$anonfun$sharedState$1(SparkSession.scala:143)
        at scala.Option.getOrElse(Option.scala:189)
        at org.apache.spark.sql.SparkSession.sharedState$lzycompute(SparkSession.scala:143)
        at org.apache.spark.sql.SparkSession.sharedState(SparkSession.scala:142)
        at info.vizierdb.spark.InitSpark$.local(InitSpark.scala:65)
        at info.vizierdb.Vizier$.initSpark(Vizier.scala:104)
        at info.vizierdb.Vizier$.main(Vizier.scala:199)
        at info.vizierdb.Vizier.main(Vizier.scala)
Caused by: java.lang.reflect.InvocationTargetException: null
        at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:135)
        ... 19 common frames omitted
Caused by: java.lang.NullPointerException: null
        at org.apache.hadoop.fs.Path.<init>(Path.java:150)
        at org.apache.hadoop.fs.Path.makeQualified(Path.java:543)
        at org.apache.hadoop.fs.FileSystem.makeQualified(FileSystem.java:667)
        at org.apache.hadoop.fs.RawLocalFileSystem.getInitialWorkingDirectory(RawLocalFileSystem.java:725)
        at org.apache.hadoop.fs.RawLocalFileSystem.<init>(RawLocalFileSystem.java:92)
        at org.apache.hadoop.fs.LocalFileSystem.<init>(LocalFileSystem.java:41)
        ... 24 common frames omitted
Exception in thread "main" java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:137)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3467)
        at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3574)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3521)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:288)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:524)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
        at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.liftedTree1$1(InMemoryCatalog.scala:122)
        at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.createDatabase(InMemoryCatalog.scala:120)
        at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:153)
        at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:140)
        at info.vizierdb.spark.InitSpark$.local(InitSpark.scala:65)
        at info.vizierdb.Vizier$.initSpark(Vizier.scala:104)
        at info.vizierdb.Vizier$.main(Vizier.scala:199)
        at info.vizierdb.Vizier.main(Vizier.scala)
Caused by: java.lang.reflect.InvocationTargetException
        at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:135)
        ... 16 more
Caused by: java.lang.NullPointerException
        at org.apache.hadoop.fs.Path.<init>(Path.java:150)
        at org.apache.hadoop.fs.Path.makeQualified(Path.java:543)
        at org.apache.hadoop.fs.FileSystem.makeQualified(FileSystem.java:667)
        at org.apache.hadoop.fs.RawLocalFileSystem.getInitialWorkingDirectory(RawLocalFileSystem.java:725)
        at org.apache.hadoop.fs.RawLocalFileSystem.<init>(RawLocalFileSystem.java:92)
        at org.apache.hadoop.fs.LocalFileSystem.<init>(LocalFileSystem.java:41)
        ... 21 more
1 targets failed
vizier.run subprocess failed

OS: Linux framework 6.2.0-26-generic #26~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Jul 13 16:27:29 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
Browser: N/A
Java version: openjdk version "11.0.20" 2023-07-18 OpenJDK Runtime Environment (build 11.0.20+8-post-Ubuntu-1ubuntu122.04) OpenJDK 64-Bit Server VM (build 11.0.20+8-post-Ubuntu-1ubuntu122.04, mixed mode, sharing)
Vizier version: Vizier-Scala 2.0.0-SNAPSHOT (c) 2021 U. Buffalo, NYU, Ill. Inst. Tech., and Breadcrumb Analytics

okennedy commented 1 year ago

Fixed in 5cd769267ace08da0b470e21406ddd50af861faa