archivesunleashed / aut

The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
https://aut.docs.archivesunleashed.org/
Apache License 2.0

Remaining Matchbox implementations for Scala #415

Status: Closed (SinghGursimran closed 4 years ago)

SinghGursimran commented 4 years ago

Remaining Matchbox implementations for Scala

223

For Testing

ExtractImageDetailsDF

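import io.archivesunleashed._
import io.archivesunleashed.app._

// `sc` is the SparkContext that spark-shell provides.
// Load the sample ARC file, keep only image records, and convert them to a DataFrame.
val df = RecordLoader.loadArchives("./src/test/resources/arc/example.arc.gz", sc)
                     .keepImages()
                     .all()

// Show the first 10 rows of extracted image details.
ExtractImageDetailsDF(df).show(10)

codecov[bot] commented 4 years ago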

Codecov Report

Merging #415 into master will increase coverage by 0.25%. The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #415      +/-   ##
==========================================
+ Coverage    77.7%   77.96%   +0.25%     
==========================================
  Files          40       41       +1     
  Lines        1552     1570      +18     
  Branches      292      292              
==========================================
+ Hits         1206     1224      +18     
  Misses        218      218              
  Partials      128      128
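
The percentages in the report can be cross-checked from the raw counts. The following is a minimal sketch, assuming coverage is computed as hits divided by total lines, with misses and partials both counted as uncovered; the `CoverageCheck` object and `Totals` case class are hypothetical names used only for illustration and are not part of aut or Codecov.

```scala
// Sketch: recompute the Codecov percentages from the counts in the report.
// Assumption: coverage = hits / lines, where lines = hits + misses + partials.
object CoverageCheck {
  final case class Totals(hits: Int, misses: Int, partials: Int) {
    // Total tracked lines is the sum of the three buckets.
    val lines: Int = hits + misses + partials
    // Coverage as a percentage of tracked lines.
    def percent: Double = 100.0 * hits / lines
  }

  // Counts copied from the "master" and "#415" columns of the report.
  val master: Totals = Totals(hits = 1206, misses = 218, partials = 128)
  val pr415: Totals  = Totals(hits = 1224, misses = 218, partials = 128)

  // Change in coverage, in percentage points.
  def delta: Double = pr415.percent - master.percent

  def main(args: Array[String]): Unit = {
    println(f"master: ${master.percent}%.3f%% of ${master.lines}%d lines")
    println(f"#415:   ${pr415.percent}%.3f%% of ${pr415.lines}%d lines")
    println(f"delta:  ${delta}%.3f percentage points")
  }
}
```

Under that assumption the line totals come out to 1552 and 1570, matching the report, and all 18 added lines are hits, which is consistent with the stated diff coverage of 100%.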