archivesunleashed / aut

The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
https://aut.docs.archivesunleashed.org/
Apache License 2.0

Release 0.80.0 JAR produces error; 0.80.1 fatjar built from repo works #495

Closed: ianmilligan1 closed this 4 years ago

ianmilligan1 commented 4 years ago

Describe the bug

While making some documentation, we discovered that the release JAR produces an error whereas a JAR built from source does not. This was tested on both my machine and @SamFritz's machine. Sarah McTavish originally discovered it while putting some stuff together.

When using aut-0.80.0-fatjar.jar from the 0.80.0 AUT release, most scripts lead to an error. For example:

import io.archivesunleashed._
import io.archivesunleashed.udfs._

RecordLoader.loadArchives("/Users/ianmilligan1/dropbox/git/aut-resources/Sample-Data/*.gz", sc).webpages()
  .select($"url")
  .show(20, false)

This leads to:

// Exiting paste mode, now interpreting.

java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps;
  at io.archivesunleashed.package$RecordLoader$.getFiles(package.scala:62)
  at io.archivesunleashed.package$RecordLoader$.loadArchives(package.scala:76)
  ... 55 elided

However, when using aut-0.80.1-SNAPSHOT-fatjar.jar as generated by cloning and building this repository (mvn clean install), the exact same script leads to:

+---------------------------------------------------------------------------------------------------------------------+
|url                                                                                                                  |
+---------------------------------------------------------------------------------------------------------------------+
|http://www.gca.ca/indexcms/?organizations&orgid=27                                                                   |
|http://www.ppforum.com/en/speeches/index.asp?theme=all&year=2003                                                     |
|http://communist-party.ca/calendar/cal_week.php?op=week&date=2006-08-18&catview=0                                    |
|http://www.canadafirst.net/immi_crime/canada_terrorist_destination.html                                              |
|http://www.web.net/~ccr/edboardmtg.html                                                                              |
|http://www.ccsd.ca/francais/pubs/2003/psi/                                                                           |
|http://www.policyalternatives.ca/saskatchewan_office_how_to_reach_us/index.cfm                                       |
|http://www.afn.ca/article.asp?id=676                                                                                 |
|http://greenparty.ca/calendar~calendar%5Bview%5D~day~month~02~year~2006~day~06.html                                  |
|http://www.equalvoice.ca/news_021304.htm                                                                             |
|http://communist-party.ca/calendar/cal_week.php?op=week&date=2006-07-21&catview=0                                    |
|http://www.canadafirst.net/immi_crime/porous_borders_make_mean_streets.html                                          |
|http://www.canadians.org/display_document.htm?COC_token=23@@8dee865f49abbc399a37a5352ea2e2fc&id=652&isdoc=1&catid=138|
|https://liberal.ca/default_f.aspx                                                                                    |
|http://westernblockparty.com/forum/login.php?redirect=privmsg.php&folder=inbox&mode=post&u=13                        |
|http://www.ppforum.com/en/speeches/index.asp?theme=all&year=2002                                                     |
|http://liberal.ca/bio_f.aspx?&id=35090                                                                               |
|http://www.ccsd.ca/francais/pubs/2001/pec/                                                                           |
|http://www.egale.ca/index.asp?lang=E&menu=1985                                                                       |
|http://canadianactionparty.ca/temp/leader-messages/Making_It_Count.asp                                               |
+---------------------------------------------------------------------------------------------------------------------+
only showing top 20 rows

As noted, we tried this on both Sam's machine and mine and got the same results.

To Reproduce

Steps to reproduce the behavior:

  1. Download the 0.80.0 jar from the 0.80.0 release
  2. Run the above script on it
  3. Watch it fail?
  4. Build the recent main using your own variation of mvn clean install
  5. Run the above script on it
  6. Watch it succeed?

Expected behavior

Should work.

Hopefully it can be as simple as replacing the release jar?

ruebot commented 4 years ago

The release jar only supports Java 8. We don't have an official release that supports Java 11 yet; Java 11 only works if you build from source on HEAD.
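
A quick way to see which runtime a shell session is actually on, as a minimal sketch: it assumes the lines are pasted into spark-shell (or any Scala REPL) and uses only standard JVM and Scala library calls. The variable names are illustrative and not part of AUT. A NoSuchMethodError on scala.Predef$.refArrayOps like the one above typically means the JAR was compiled against a different Scala version than the one the shell provides, so printing both versions is a reasonable first check.

```scala
// Report the Java and Scala versions of the running shell.
// Compare these against what the AUT release you downloaded was built for.
val javaVersion: String = System.getProperty("java.version")
val scalaVersion: String = scala.util.Properties.versionNumberString

println(s"Java version:  $javaVersion")
println(s"Scala version: $scalaVersion")
```

In spark-shell, sc.version additionally reports the Spark version in use.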

tl;dr;

ianmilligan1 commented 4 years ago

Well that's a facepalm! Thanks @ruebot and sorry to waste your time on that.