LAW-Unimi / BUbiNG

The LAW next generation crawler.
http://law.di.unimi.it/software.php#bubing
Apache License 2.0

ivy.xml outdated #24

Closed dennis-kao closed 3 years ago

dennis-kao commented 3 years ago

Hi all,

I've been running into some trouble trying to build this program. I have downloaded the appropriate dependencies using ant ivy-setupjars and tried running ant compile.

It fails with:

[javac] location: package org.apache.commons.io.input
[javac] /root/BUbiNG/src/it/unimi/di/law/bubing/util/URLRespectsRobots.java:13: error: cannot find symbol
[javac] import org.apache.commons.io.input.BOMInputStream;
[javac] ^
[javac] symbol: class BOMInputStream
[javac] location: package org.apache.commons.io.input
[javac] /root/BUbiNG/src/it/unimi/di/law/bubing/util/FetchData.java:309: error: incompatible types: Charset cannot be converted to String
[javac] fakeEntity.setContent(IOUtils.toInputStream(content, Charsets.ISO_8859_1));
[javac] ^
[javac] /root/BUbiNG/src/it/unimi/di/law/bubing/parser/SpamTextProcessor.java:69: error: cannot find symbol
[javac] fbr.setReader(new CharSequenceReader(csq));
[javac] ^
[javac] symbol: class CharSequenceReader
[javac] location: class SpamTextProcessor
[javac] /root/BUbiNG/src/it/unimi/di/law/bubing/parser/SpamTextProcessor.java:76: error: cannot find symbol
[javac] fbr.setReader(new CharSequenceReader(csq.subSequence(start, end)));
[javac] ^
[javac] symbol: class CharSequenceReader
[javac] location: class SpamTextProcessor
[javac] /root/BUbiNG/src/it/unimi/di/law/bubing/util/URLRespectsRobots.java:183: error: cannot find symbol
[javac] BOMInputStream bomInputStream = new BOMInputStream(robotsResponse.response().getEntity().getContent(), true);
[javac] ^
[javac] symbol: class BOMInputStream
[javac] location: class URLRespectsRobots
[javac] /root/BUbiNG/src/it/unimi/di/law/bubing/util/URLRespectsRobots.java:183: error: cannot find symbol
[javac] BOMInputStream bomInputStream = new BOMInputStream(robotsResponse.response().getEntity().getContent(), true);
[javac] ^
[javac] symbol: class BOMInputStream
[javac] location: class URLRespectsRobots
[javac] /root/BUbiNG/src/it/unimi/di/law/warc/filters/ResponseMatches.java:49: error: no suitable method found for toString(InputStream,Charset)
[javac] return pattern.matcher(IOUtils.toString(content, StandardCharsets.ISO_8859_1)).matches();
[javac] ^
[javac] method IOUtils.toString(InputStream,String) is not applicable
[javac] (argument mismatch; Charset cannot be converted to String)
[javac] method IOUtils.toString(byte[],String) is not applicable
[javac] (argument mismatch; InputStream cannot be converted to byte[])
[javac] Note: /root/BUbiNG/src/it/unimi/di/law/bubing/frontier/Frontier.java uses or overrides a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
[javac] Note: Some messages have been simplified; recompile with -Xdiags:verbose to get full output
[javac] 8 errors
[javac] 1 warning

After inspecting commons-io.jar, I found that it was indeed missing those classes. By adding this to ivy.xml:

```xml
<dependency org="commons-io" name="commons-io" rev="2.6"/>
```

I was able to resolve quite a few errors, but this popped up:

[javac] Compiling 145 source files to /root/BUbiNG/build
[javac] warning: [options] bootstrap class path not set in conjunction with -source 8
[javac] /root/BUbiNG/src/it/unimi/di/law/bubing/frontier/Frontier.java:916: error: unreported exception ConfigurationException; must be caught or declared to be thrown
[javac] scalarData.save(new File(snapDir, "frontier.data"));
[javac] ^
[javac] /root/BUbiNG/src/it/unimi/di/law/bubing/frontier/Frontier.java:963: error: unreported exception ConfigurationException; must be caught or declared to be thrown
[javac] final Properties scalarData = new Properties(new File(snapDir, "frontier.data"));
[javac] ^
[javac] Note: /root/BUbiNG/src/it/unimi/di/law/bubing/frontier/Frontier.java uses or overrides a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
[javac] 2 errors
[javac] 1 warning

I've tried the following commons-io versions so far, with no success:

2.3, 1.3.2, 2.5, 2.0, 1.4, 2.2, 2.1

I believe this dependency issue popped up because commons-io is not explicitly specified in ivy.xml, but I could be wrong; I'm not a Java dev by any means.

Am I doing something wrong? If not, could someone list the dependencies used and their version numbers?
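When a build fails with "cannot find symbol" for library classes, it can help to check which jar, if any, actually supplies them. The following is a minimal sketch of such a check using only the JDK; the class name `ClasspathCheck` and the helper `locate` are made up for illustration, and the class names probed are the ones taken from the compiler output above.

```java
import java.security.CodeSource;

class ClasspathCheck {

    /**
     * Returns a description of where the named class was loaded from,
     * or null if it cannot be found on the current classpath.
     */
    static String locate(String className) {
        try {
            // initialize=false: we only care whether the class is present.
            Class<?> c = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes from the JDK itself have no code source.
            return src == null ? "(JDK runtime)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Classes named in the javac errors above.
        String[] names = {
            "org.apache.commons.io.IOUtils",
            "org.apache.commons.io.input.BOMInputStream",
            "org.apache.commons.io.input.CharSequenceReader",
        };
        for (String name : names) {
            String where = locate(name);
            System.out.println(name + " -> " + (where == null ? "MISSING" : where));
        }
    }
}
```

Run it with the same jars the build uses, for example `java -cp "<jar-dir>/*:." ClasspathCheck`, where `<jar-dir>` is a placeholder for wherever ivy placed the downloaded jars. A class reported as MISSING, or loaded from an unexpectedly old jar, points to the dependency that needs an explicit entry in ivy.xml.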

dennis-kao commented 3 years ago

A workaround is to download the jar files from here and compile the binaries yourself:

http://law.di.unimi.it/software/download/

Jar files are platform independent so this should work across all operating systems!

vigna commented 3 years ago

You're totally right. I went through our whole software stack (fastutil, DSI utilities, etc.) removing unused dependencies, and commons-io was one of them, so it is no longer inherited. Also, the upgrade to commons-configuration2 required a one-line fix for dependencies. Everything should now work out of the box; please write again if you run into any problems.
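For readers who hit the same "unreported exception ...; must be caught or declared to be thrown" message: it is what javac reports when a library call declares a checked exception that the calling code does not handle. The sketch below shows the two usual ways to deal with it. `StoreException` and `save` are hypothetical stand-ins, not the real commons-configuration2 or BUbiNG code, and this is not the fix that was applied to the repository.

```java
import java.io.File;
import java.io.IOException;

class CheckedExceptionSketch {

    /** Hypothetical stand-in for a library's checked exception. */
    static class StoreException extends Exception {
        StoreException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for a library call that declares a checked exception. */
    static void save(File target) throws StoreException {
        if (target.isDirectory()) {
            throw new StoreException("Cannot write to a directory: " + target);
        }
        // A real implementation would write the data here.
    }

    /** Option 1: propagate the exception by declaring it on the caller. */
    static void snapshotDeclaring(File target) throws StoreException {
        save(target);
    }

    /** Option 2: catch it and rethrow as IOException, keeping the original as the cause. */
    static void snapshotWrapping(File target) throws IOException {
        try {
            save(target);
        } catch (StoreException e) {
            throw new IOException("Could not save " + target, e);
        }
    }

    public static void main(String[] args) {
        try {
            // "." is a directory, so the stand-in save() fails.
            snapshotWrapping(new File("."));
        } catch (IOException e) {
            System.out.println("Wrapped: " + e.getCause().getMessage());
        }
    }
}
```

Declaring the exception is the smaller change but alters the caller's signature. Wrapping keeps the signature stable if the caller already throws IOException.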