Norconex / crawlers

Norconex Crawlers (or spiders) are flexible web and filesystem crawlers for collecting, parsing, and manipulating data from the web or filesystem to various data repositories such as search engines.
https://opensource.norconex.com/crawlers
Apache License 2.0

Could not find artifact edu.ucar:jj2000:jar:5.3 in central #440

Closed. danizen closed this issue 3 years ago.

danizen commented 6 years ago

When I check oss-sonatype and Maven Central, I see only 5.2, and I cannot build without masking the dependency in my POM. It is hard to mask because I don't know what depends on 5.3, so I don't know where to put the exclusion. Working on it.

danizen commented 6 years ago

maven-repository.com also shows only 5.2: http://maven-repository.com/search?q=jj2000

douglas-andrew-harley commented 6 years ago

https://mvnrepository.com/artifact/edu.ucar/jj2000/5.3

It appears that this is hosted in the Boundless Geo repo.
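
For builds that are allowed to reach a third-party repository, declaring it in the POM is one way to let Maven resolve the 5.3 artifact. The fragment below is only a sketch: the `id` and the `url` are placeholders (assumptions, not the real Boundless Geo coordinates) and must be replaced with the repository's actual values.

```xml
<!-- Goes directly under <project> in pom.xml -->
<repositories>
  <repository>
    <!-- Hypothetical id; any unique name works -->
    <id>boundless-geo</id>
    <!-- Placeholder URL: substitute the real repository address -->
    <url>https://repo.example.org/boundless/main</url>
    <snapshots>
      <!-- Only released artifacts are needed here -->
      <enabled>false</enabled>
    </snapshots>
  </repository>
</repositories>
```

This does not help in environments where only Maven Central and oss-sonatype are reachable.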

danizen commented 6 years ago

@douglas-andrew-harley, thanks. I see that the importer is pulling it in indirectly through the pdfbox-debugger jar.

At least now I can mask it out, but being on-premise in the U.S. Federal Government, I cannot use the Boundless Geo repo anyway.

[INFO] +- com.norconex.collectors:norconex-importer:jar:2.8.0:compile
[INFO] |  +- commons-codec:commons-codec:jar:1.10:compile
[INFO] |  +- org.apache.pdfbox:pdfbox:jar:2.0.7:compile
[INFO] |  |  \- org.apache.pdfbox:fontbox:jar:2.0.7:compile
[INFO] |  +- org.apache.pdfbox:xmpbox:jar:2.0.7:compile
[INFO] |  +- org.apache.pdfbox:pdfbox-tools:jar:2.0.7:compile
[INFO] |  |  \- org.apache.pdfbox:pdfbox-debugger:jar:2.0.7:compile
[INFO] |  +- org.jsoup:jsoup:jar:1.10.3:compile
[INFO] |  +- org.apache.httpcomponents:httpclient:jar:4.5.3:compile
[INFO] |  +- org.apache.httpcomponents:httpmime:jar:4.5.3:compile
[INFO] |  +- org.apache.httpcomponents:httpcore:jar:4.4.6:compile
[INFO] |  +- joda-time:joda-time:jar:2.9.9:compile
[INFO] |  +- edu.ucar:jj2000:jar:5.3:compile

danizen commented 6 years ago

It does seem better if Norconex uses only Maven Central and oss-sonatype, so that builds need fewer repositories, but this is a pretty low priority issue.

If anyone can suggest an improvement in dependency management to the excerpt below, I'm all ears. Although I hold my own, I'm not a Maven maven:

    <dependency>
      <groupId>com.norconex.collectors</groupId>
      <artifactId>norconex-importer</artifactId>
      <version>${norconex.importer.version}</version>
      <exclusions>
        <exclusion>
          <!-- We will get these classes somewhere else -->
          <groupId>de.l3s.boilerpipe</groupId>
          <artifactId>boilerpipe</artifactId>
        </exclusion>
        <exclusion>
          <!-- edu.ucar:jj2000:5.3 not available in standard repos -->
          <groupId>edu.ucar</groupId>
          <artifactId>jj2000</artifactId>
        </exclusion>
      </exclusions>
    </dependency>

    <!-- Get the boilerpipe classes from here: in fact a later version, but it pulls in an earlier xerces than we need -->
    <dependency>
      <groupId>com.syncthemall</groupId>
      <artifactId>boilerpipe</artifactId>
      <version>1.2.2</version>
      <exclusions>
        <exclusion>
          <groupId>xerces</groupId>
          <artifactId>xercesImpl</artifactId>
        </exclusion>
      </exclusions>
    </dependency>

    <!-- Use older version of edu.ucar:jj2000 -->
    <dependency>
      <groupId>edu.ucar</groupId>
      <artifactId>jj2000</artifactId>
      <version>5.2</version>
    </dependency>
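
One possible simplification of the jj2000 part, offered as a sketch and not as the project's recommended setup: Maven's `<dependencyManagement>` section also controls the versions of transitive dependencies, so pinning `edu.ucar:jj2000` there may replace both the exclusion and the explicit 5.2 dependency. It assumes that 5.2 is a compatible stand-in for 5.3, which this thread does not confirm.

```xml
<!-- Goes directly under <project> in pom.xml -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>edu.ucar</groupId>
      <artifactId>jj2000</artifactId>
      <!-- Assumption: 5.2 (available in Maven Central) works in place of 5.3 -->
      <version>5.2</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```

With this in place, any dependency that asks for jj2000 transitively should resolve to 5.2. Running `mvn dependency:tree` afterwards is a quick way to check which version ends up in the build.
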

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.