USCDataScience / sparkler

Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
http://irds.usc.edu/sparkler/
Apache License 2.0

Sparkler-93: YAML backed config with Schema validation #110

Closed thammegowda closed 4 years ago

thammegowda commented 7 years ago

What changes were proposed in this pull request?

This PR replaces the current JSON-based config system with a pure YAML-based config system that supports type checking and schema validation.

It takes the changes from #109, integrates them with the Sparkler system, and completely replaces the older config with the newer one.

The code is also reorganized: for instance, a config package is added and exposed to plugins.
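To illustrate what type checking and schema validation of a config might look like, here is a minimal sketch in Java. It is not the code from this PR: the class name `ConfigSchemaSketch`, the keys `crawl.id`, `fetch.delay.ms`, and `plugins.active`, and the `validate` method are all hypothetical. It also assumes the YAML file has already been parsed into a `Map<String, Object>` by a YAML library, so the parsing step itself is not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ConfigSchemaSketch {

    // Hypothetical schema: each required key mapped to the Java type its value must have.
    static final Map<String, Class<?>> SCHEMA = new LinkedHashMap<>();
    static {
        SCHEMA.put("crawl.id", String.class);
        SCHEMA.put("fetch.delay.ms", Integer.class);
        SCHEMA.put("plugins.active", List.class);
    }

    // Returns one message per problem; an empty list means the config passed validation.
    static List<String> validate(Map<String, Object> config) {
        List<String> errors = new ArrayList<>();
        // First pass: every schema key must be present with a value of the expected type.
        for (Map.Entry<String, Class<?>> rule : SCHEMA.entrySet()) {
            Object value = config.get(rule.getKey());
            if (value == null) {
                errors.add("missing key: " + rule.getKey());
            } else if (!rule.getValue().isInstance(value)) {
                errors.add("wrong type for " + rule.getKey() + ": expected "
                        + rule.getValue().getSimpleName() + ", got "
                        + value.getClass().getSimpleName());
            }
        }
        // Second pass: keys that the schema does not know about are reported too.
        for (String key : config.keySet()) {
            if (!SCHEMA.containsKey(key)) {
                errors.add("unknown key: " + key);
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        // Stand-in for a parsed YAML document (assumption: parsing happens elsewhere).
        Map<String, Object> config = new LinkedHashMap<>();
        config.put("crawl.id", "demo");
        config.put("fetch.delay.ms", "1000"); // wrong type: String instead of Integer
        config.put("plugins.active", Arrays.asList("a", "b"));
        validate(config).forEach(System.out::println);
    }
}
```

Collecting all problems into a list, rather than throwing on the first one, lets a user fix a config file in one pass. Whether the PR reports errors this way is not stated in the description.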

Is this related to an already existing issue on sparkler?
Related to #93 and #109

Will it close an existing issue?
Closes #93
Closes #109

How was this patch tested?

A few new tests were added by @SHASHANK-PRO-05 in #109. The existing tests are kept intact. All tests passed for mvn clean test.

thammegowda commented 7 years ago

@SHASHANK-PRO-05 @karanjeets Give this a try and review the changes...

sk-s-hub commented 7 years ago

@thammegowda I am getting the following error:

java.lang.LinkageError: loader constraint violation: loader (instance of org/apache/felix/framework/BundleWiringImpl$BundleClassLoaderJava5) previously initiated loading for a different type with name "edu/usc/irds/sparkler/config/SparklerConfig"
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
    at org.apache.felix.framework.BundleWiringImpl$BundleClassLoader.defineClass(BundleWiringImpl.java:2370)
    at org.apache.felix.framework.BundleWiringImpl$BundleClassLoader.findClass(BundleWiringImpl.java:2154)
    at org.apache.felix.framework.BundleWiringImpl.findClassOrResourceByDelegation(BundleWiringImpl.java:1542)
    at org.apache.felix.framework.BundleWiringImpl.access$400(BundleWiringImpl.java:79)
    at org.apache.felix.framework.BundleWiringImpl$BundleClassLoader.loadClass(BundleWiringImpl.java:2018)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at edu.usc.irds.sparkler.util.FetcherDefault.init(FetcherDefault.java:.......

Probably SparklerConfig is being loaded twice by the class loader.

Steps to reproduce

thammegowda commented 7 years ago

@SHASHANK-PRO-05 Not sure. Investigating

chrismattmann commented 4 years ago

closing since it's been years and this is way out of date