Open nishkalavallabhi opened 6 years ago
I have identified the reason for your first problem. Unfortunately, the method responsible for creating the required folders was commented out. I have just updated the Pipeline class.
Currently, I cannot reproduce the problem with Jsoup. I hope your workaround of downloading Jsoup separately solved this issue.
Thanks for the quick response! Yes, the first problem seems to be solved now, and downloading jsoup fixed the second issue. However, there seems to be another missing jar, for Apache Commons Compress: it throws an error after the downloads have run.
I have just run the first step of the Pipeline (after small updates to the code for decompressing and downloading schema files). Everything worked fine. I hope that manually downloading the missing jars helped. As I cannot reproduce your error and have not received similar complaints so far, I cannot work on that specific issue. But please let me know if you face more problems.
I tried to follow the steps mentioned but ran into an error when executing step 2: java -jar Pipeline.jar path_to_config_file.txt 1
I got the following error message:
java.nio.file.NoSuchFileException: path_to_config_file.txt
at sun.nio.fs.WindowsException.translateToIOException(Unknown Source)
at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
at sun.nio.fs.WindowsFileSystemProvider.newByteChannel(Unknown Source)
at java.nio.file.Files.newByteChannel(Unknown Source)
at java.nio.file.Files.newByteChannel(Unknown Source)
at java.nio.file.spi.FileSystemProvider.newInputStream(Unknown Source)
at java.nio.file.Files.newInputStream(Unknown Source)
at java.nio.file.Files.newBufferedReader(Unknown Source)
at java.nio.file.Files.readAllLines(Unknown Source)
at de.l3s.eventkg.pipeline.Config.init(Config.java:57)
at de.l3s.eventkg.pipeline.Pipeline.main(Pipeline.java:52)
Exception in thread "main" java.lang.NullPointerException
at de.l3s.eventkg.pipeline.Pipeline.main(Pipeline.java:54)
Please help.
You need to replace "path_to_config_file.txt" with the actual path to the file containing the configuration data. The file content should be similar to what is shown in the "Configuration" section of the readme (https://github.com/sgottsch/eventkg).
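A minimal sketch of how a wrong config path could be caught before the pipeline continues, assuming the config is read with Files.readAllLines as the stack trace above suggests. The class ConfigCheck and the method readConfigLines are hypothetical names for illustration and are not part of EventKG.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper for illustration; not part of the EventKG code base.
class ConfigCheck {

    // Reads all lines of the config file. Fails with a readable message
    // (instead of a later NullPointerException) if the path is not a regular file.
    static List<String> readConfigLines(String configPath) {
        Path path = Paths.get(configPath).toAbsolutePath();
        if (!Files.isRegularFile(path)) {
            throw new IllegalArgumentException("Configuration file not found: " + path);
        }
        try {
            return Files.readAllLines(path);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.err.println("Usage: java ConfigCheck <path to config file>");
            return;
        }
        List<String> lines = readConfigLines(args[0]);
        System.out.println("Read " + lines.size() + " configuration lines.");
    }
}
```

Printing the absolute path in the message makes it easier to see which working directory a relative path was resolved against.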
Hello Sir, thanks for the reply. There was a problem with the timestamp; it worked after I updated it to a more recent timestamp (the older version has been moved or updated). However, it still throws an error related to WIKINEWS, possibly because the source has been moved, so it is unable to download the WIKINEWS data. Could you suggest a way around this problem?
Thanking You, Parth
I am trying to follow the instructions. I am getting the following error at "Step 2: Data Download"
My data path in the config file was: /Users/username/Downloads/EventKG/data/ (I masked the username). I understood from the instructions that this step is where all the data gets downloaded. However, the exception makes me think I should already have a list of dump files?
This is how it looks after I created an executable jar by exporting the Pipeline class.
java -jar Pipeline.jar config.txt 1
Step 1: Download files.
java.io.FileNotFoundException: /Users/username/Downloads/EventKG/data/raw_data/wikipedia/en/dump_file_list.txt (No such file or directory)
at java.io.FileOutputStream.open0(Native Method)
at java.io.FileOutputStream.open(FileOutputStream.java:270)
at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
at java.io.FileOutputStream.<init>(FileOutputStream.java:101)
at java.io.PrintWriter.<init>(PrintWriter.java:184)
at de.l3s.eventkg.util.FileLoader.getWriter(FileLoader.java:161)
at de.l3s.eventkg.pipeline.RawDataDownLoader.downloadWikipediaFiles(RawDataDownLoader.java:177)
at de.l3s.eventkg.pipeline.RawDataDownLoader.downloadFiles(RawDataDownLoader.java:159)
at de.l3s.eventkg.pipeline.Pipeline.download(Pipeline.java:118)
at de.l3s.eventkg.pipeline.Pipeline.main(Pipeline.java:71)
Exception in thread "main" java.lang.NullPointerException
at de.l3s.eventkg.pipeline.RawDataDownLoader.downloadWikipediaFiles(RawDataDownLoader.java:202)
at de.l3s.eventkg.pipeline.RawDataDownLoader.downloadFiles(RawDataDownLoader.java:159)
at de.l3s.eventkg.pipeline.Pipeline.download(Pipeline.java:118)
at de.l3s.eventkg.pipeline.Pipeline.main(Pipeline.java:71)
At that point, I manually created the path raw_data/wikipedia/en within the data folder (please mention this clearly in the readme), and now I run into an error with jsoup.
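The FileNotFoundException above occurs because a PrintWriter does not create missing parent directories. A minimal sketch of creating them first, assuming the folder layout from the stack trace; DumpListWriter and openWriter are hypothetical names and this is not the actual EventKG fix.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of the EventKG code base.
class DumpListWriter {

    // Creates all missing parent directories, then opens a writer for the file.
    static PrintWriter openWriter(Path file) {
        try {
            Path parent = file.toAbsolutePath().getParent();
            if (parent != null) {
                // No-op if the directories already exist.
                Files.createDirectories(parent);
            }
            return new PrintWriter(Files.newBufferedWriter(file, StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The data folder is taken from the first argument; "data" is a placeholder default.
        String dataFolder = args.length > 0 ? args[0] : "data";
        Path file = Paths.get(dataFolder, "raw_data", "wikipedia", "en", "dump_file_list.txt");
        try (PrintWriter writer = openWriter(file)) {
            writer.println("# placeholder line written by the sketch");
        }
        System.out.println("Wrote " + file.toAbsolutePath());
    }
}
```

With the directories created in code, the manual step of setting up raw_data/wikipedia/en would no longer be needed.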
Exception in thread "main" java.lang.NoClassDefFoundError: org/jsoup/Jsoup
at de.l3s.eventkg.pipeline.RawDataDownLoader.downloadWikipediaFiles(RawDataDownLoader.java:179)
at de.l3s.eventkg.pipeline.RawDataDownLoader.downloadFiles(RawDataDownLoader.java:159)
at de.l3s.eventkg.pipeline.Pipeline.download(Pipeline.java:118)
at de.l3s.eventkg.pipeline.Pipeline.main(Pipeline.java:71)
Caused by: java.lang.ClassNotFoundException: org.jsoup.Jsoup
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:338)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 4 more
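A NoClassDefFoundError like this usually means the dependency jar is not on the runtime classpath, which can happen when an exported jar does not bundle its libraries. A minimal sketch for checking this up front: ClasspathCheck and isOnClasspath are hypothetical names, and the Commons Compress class listed is only an assumed example, since the thread does not say which class was missing.

```java
// Hypothetical helper for illustration; not part of the EventKG code base.
class ClasspathCheck {

    // Returns true if the named class can be found by this class's loader.
    // The class is looked up without being initialized.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] required = {
            "org.jsoup.Jsoup", // taken from the stack trace above
            // Assumption: a class that exists in Apache Commons Compress;
            // which class the pipeline actually needs is not stated in the thread.
            "org.apache.commons.compress.archivers.ArchiveStreamFactory"
        };
        for (String className : required) {
            String status = isOnClasspath(className) ? "found" : "MISSING";
            System.out.println(status + ": " + className);
        }
    }
}
```

Note that when a program is started with java -jar, the -cp option is ignored and the classpath comes from the jar's manifest, so the missing jars either have to be bundled into the exported jar or listed in its Class-Path entry.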