archivesunleashed / aut

The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
https://aut.docs.archivesunleashed.org/
Apache License 2.0

Add imagegraph and webgraph to command line app. #432

Closed: ruebot closed this 4 years ago

ruebot commented 4 years ago

GitHub issue(s): #431

What does this Pull Request do?

Add imagegraph and webgraph to command line app.

How should this be tested?

ImageGraphExtractor example:

WebPagesExtractor example:

Additional Notes:

  1. There is the side issue of the required log4j config file. I'll create a ticket for that and work on a solution separately.
  2. As a separate issue, I noticed that DomainGraphExtractor writes its output as GEXF. Does anybody remember why it uses GEXF? Shouldn't it be GraphML? https://github.com/archivesunleashed/aut/blob/master/src/main/scala/io/archivesunleashed/app/CommandLineApp.scala#L115-L123
  3. I haven't added webgraph and domains because there is a bit of duplication with DomainFrequencyExtractor and DomainGraphExtractor, and there could also be some confusion over how the extractors are labeled.
  4. Once we sort out some of the above, I'll continue working on the documentation update branch I have locally for https://github.com/archivesunleashed/aut-docs/issues/14

codecov[bot] commented 4 years ago

Codecov Report

Merging #432 into master will increase coverage by 0.28%. The diff coverage is 96.29%.

@@            Coverage Diff             @@
##           master     #432      +/-   ##
==========================================
+ Coverage   77.70%   77.99%   +0.28%     
==========================================
  Files          41       43       +2     
  Lines        1534     1554      +20     
  Branches      282      286       +4     
==========================================
+ Hits         1192     1212      +20     
  Misses        217      217              
  Partials      125      125

ruebot commented 4 years ago

#433 has been created (log4j configuration issue).