I assumed "fe" is a FeatureExtractor, so I tried to run this code:
:require lib/jsoup-custom.jar
:require lib/knowitall-cluewebextractor.jar
:require target/scala-2.10/boilerplate_2.10-2.0-SNAPSHOT.jar
import ch.ethz.dalab.web2text.cdom.CDOM
import ch.ethz.dalab.web2text.features.{FeatureExtractor, PageFeatures}
import ch.ethz.dalab.web2text.features.extractor._
import ch.ethz.dalab.web2text.alignment.Alignment
import ch.ethz.dalab.web2text.utilities.Util
import ch.ethz.dalab.web2text.cleaneval.CleanEval
import ch.ethz.dalab.web2text.output.CsvDatasetWriter
val unaryExtractor = DuplicateCountsExtractor + LeafBlockExtractor + AncestorExtractor(NodeBlockExtractor + TagExtractor(mode="node"), 1) + AncestorExtractor(NodeBlockExtractor, 2) + RootExtractor(NodeBlockExtractor) + TagExtractor(mode="leaf")
val pairwiseExtractor = TreeDistanceExtractor + BlockBreakExtractor + CommonAncestorExtractor(NodeBlockExtractor)
val extractor = FeatureExtractor(unaryExtractor, pairwiseExtractor)
val data = Util.time{ CleanEval.dataset(extractor) }
but this error happened:
error: missing or invalid dependency detected while loading class file 'PageFeatures.class'.
Could not access term breeze in package <root>,
because it (or its dependencies) are missing. Check your build definition for
missing or conflicting dependencies. (Re-run with `-Ylog-classpath` to see the problematic classpath.)
A full rebuild may help if 'PageFeatures.class' was compiled against an incompatible version of <root>.
error: missing or invalid dependency detected while loading class file 'PageFeatures.class'.
Could not access term linalg in value breeze,
because it (or its dependencies) are missing. Check your build definition for
missing or conflicting dependencies. (Re-run with `-Ylog-classpath` to see the problematic classpath.)
A full rebuild may help if 'PageFeatures.class' was compiled against an incompatible version of breeze.
What is that?
You are right: fe stands for FeatureExtractor.
The error you are getting indicates that the Breeze dependency is missing from the classpath. Could you please try adding it as described here:
https://github.com/scalanlp/breeze/wiki/Installation
@tvogels I ran the following before rerunning the code:
$ sbt
set scalaVersion := "2.10.4" // or 2.11.5
set libraryDependencies += "org.scalanlp" %% "breeze" % "0.12"
set libraryDependencies += "org.scalanlp" %% "breeze-viz" % "0.12"
set resolvers += "Sonatype Releases" at "https://oss.sonatype.org/content/repositories/releases/"
console
Then I reran the same code, and it finally works!
Thank you so much!
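For anyone hitting the same error: `set` commands typed into the sbt shell only apply to the current session, so they have to be repeated every time sbt is restarted. A minimal sketch of the same settings as `build.sbt` entries follows. It assumes these lines are merged into the project's existing build definition, which is not shown in this thread, and it reuses exactly the versions and resolver from the commands above.

```scala
// build.sbt (sketch): persists the settings entered with `set` in the sbt shell.
// Assumption: merged into the project's existing build.sbt, not used as a full replacement.

// Matches the target/scala-2.10 jar loaded in the REPL session.
scalaVersion := "2.10.4"

// Same resolver as in the sbt session.
resolvers += "Sonatype Releases" at "https://oss.sonatype.org/content/repositories/releases/"

// Breeze provides the breeze.linalg classes that PageFeatures.class was compiled against.
libraryDependencies ++= Seq(
  "org.scalanlp" %% "breeze"     % "0.12",
  "org.scalanlp" %% "breeze-viz" % "0.12"
)
```

With these entries in place, running `sbt console` should put Breeze on the REPL classpath without the extra `set` steps.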
I want to extract the CleanEval data. I believe the README explains how, like this:
but I don't understand what "fe" is. Could you explain how to define "fe"?