mitdbg / aurum-datadiscovery

MIT License
NullPointerException when running quickstart with example config #132

Closed: jfinkels closed this issue 4 years ago

jfinkels commented 4 years ago

Environment: Ubuntu 19.10

Steps to reproduce, following the quickstart:

# Clone the repository.
git clone git@github.com:mitdbg/aurum-datadiscovery.git
cd aurum-datadiscovery

# Install JDK 8.
sudo apt install openjdk-8-jdk

# Build the ddprofiler.
cd ddprofiler
bash build.sh

# Install Elasticsearch 6.8.6.
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.8.6.zip
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.8.6.zip.sha512
shasum -a 512 -c elasticsearch-6.8.6.zip.sha512 
unzip elasticsearch-6.8.6.zip
elasticsearch-6.8.6/bin/elasticsearch &

# Run the ddprofiler.
bash run.sh --sources /home/jeffrey/src/aurum-datadiscovery/exampleconfig.yml

The contents of exampleconfig.yml are:

# exampleconfig.yml
api_version: 0
sources:
- name: "csv_repository"
  type: csv
  config:
    path: "/home/jeffrey/src/aurum-datadiscovery/example.csv"
    separator: ','

The contents of example.csv are:

# example.csv
First name,Last name,Birth year
John,Lennon,1940
Paul,McCartney,1942
George,Harrison,1943
Ringo,Starr,1940

What happens now:

$ bash run.sh --sources /home/jeffrey/src/aurum-datadiscovery/exampleconfig.yml
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/jeffrey/src/aurum-datadiscovery/ddprofiler/build/install/ddprofiler/lib/logback-classic-1.2.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/jeffrey/src/aurum-datadiscovery/ddprofiler/build/install/ddprofiler/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/jeffrey/src/aurum-datadiscovery/ddprofiler/build/install/ddprofiler/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]
22:56:55.857 [main] INFO core.config.ProfilerConfig - ProfilerConfig values:
    num.pool.threads = 4
    store.port = 9300
    sources = /home/jeffrey/src/aurum-datadiscovery/exampleconfig.yml
    web.server.port = 8080
    num.record.read = 1000
    store.type = 2
    store.http.port = 9200
    store.server = 127.0.0.1
    error.logfile.name = error_profiler.log
    experimental = false
    execution.mode = 0
    console.metrics = -1
6
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console. Set system property 'log4j2.debug' to show Log4j2 internal initialization logging.
22:57:00.642 [main] INFO store.NativeElasticStore - Indices already exist, moving on
22:57:00.643 [main] INFO core.Main - Using /home/jeffrey/src/aurum-datadiscovery/exampleconfig.yml as sources file
22:57:00.772 [main] INFO core.Main - Found 1 sources to profile
22:57:00.773 [main] INFO core.Main - Processing source csv_repository of type csv
Exception in thread "main" java.lang.NullPointerException
    at sources.implementations.CSVSource.processSource(CSVSource.java:96)
    at core.Main.startProfiler(Main.java:68)
    at core.Main.main(Main.java:123)

What I expected to happen: no error.

jbalint commented 4 years ago

Can you try providing the path to the directory, without the filename, as shown in template.yml?

    # path indicates where the CSV files live
    path: "/Users/test/data/csvrepository/"

jfinkels commented 4 years ago

You are right, of course. Thanks. I got confused because I assumed path referred to a file rather than a directory. Maybe dir would have been a clearer name?
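
For anyone who hits the same trace: the thread does not show what CSVSource.java line 96 actually does, so the following is only a guess at the failure mode, not the project's code. One common way to get a NullPointerException from a file path is java.io.File.listFiles(), which returns null instead of throwing when the path is a regular file or does not exist. The class PathCheck and the helper listSourceFiles below are hypothetical names used only for illustration; they sketch how a loader could reject a non-directory path with a readable message.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

public class PathCheck {

    // Hypothetical helper, not part of ddprofiler: returns the names of the
    // regular files directly under dirPath, or fails with a descriptive
    // message when dirPath cannot be listed as a directory.
    static List<String> listSourceFiles(String dirPath) {
        File dir = new File(dirPath);
        // File.listFiles() returns null (it does not throw) when the path
        // is not a directory or an I/O error occurs.
        File[] entries = dir.listFiles();
        if (entries == null) {
            throw new IllegalArgumentException(
                "path must be a directory containing CSV files, got: " + dirPath);
        }
        List<String> names = new ArrayList<>();
        for (File entry : entries) {
            if (entry.isFile()) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway directory holding a single CSV file.
        File dir = Files.createTempDirectory("csvrepository").toFile();
        File csv = new File(dir, "example.csv");
        Files.write(csv.toPath(), "First name,Last name,Birth year\n".getBytes());

        // Listing the file itself yields null; iterating over that result
        // unguarded would throw a NullPointerException.
        System.out.println(csv.listFiles() == null); // prints true

        // Listing the directory works as expected.
        System.out.println(listSourceFiles(dir.getPath())); // prints [example.csv]
    }
}
```

If this guess matches the real cause, a check like the one in listSourceFiles would turn the bare NullPointerException into an error that names the offending path.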