ukwa / webarchive-discovery

WARC and ARC indexing and discovery tools.
https://github.com/ukwa/webarchive-discovery/wiki

Exception in thread "main" java.lang.NoSuchFieldError: LUCENE_8_8_2 #302

Closed by steph-nb 1 year ago

steph-nb commented 1 year ago

Hello there,

When following the Quick Start guide (https://github.com/ukwa/webarchive-discovery/wiki/Quick-Start) to run webarchive-discovery (master) locally, I ran into this issue:

Parsing Archive File [1/1]:../WAS-340616-20190514071541349-00000-~~.arc.gz
Exception in thread "main" java.lang.NoSuchFieldError: LUCENE_8_8_2
    at org.opensearch.Version.(Version.java:73)
    at org.opensearch.OpenSearchException.(OpenSearchException.java:79)
    at org.opensearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:1653)
    at org.opensearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1407)
    at org.opensearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1364)
    at org.opensearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1334)
    at org.opensearch.client.RestHighLevelClient.bulk(RestHighLevelClient.java:366)
    at uk.bl.wa.opensearch.OpensearchImporter.importDocuments(OpensearchImporter.java:112)
    at uk.bl.wa.indexer.delivery.OpensearchDocumentConsumer.performFlush(OpensearchDocumentConsumer.java:94)
    at uk.bl.wa.indexer.delivery.BufferedDocumentConsumer.flush(BufferedDocumentConsumer.java:126)
    at uk.bl.wa.indexer.delivery.BufferedDocumentConsumer.add(BufferedDocumentConsumer.java:106)
    at uk.bl.wa.indexer.WARCIndexerCommand.parseWarcFiles(WARCIndexerCommand.java:243)
    at uk.bl.wa.indexer.WARCIndexerCommand.main(WARCIndexerCommand.java:132)

By the way, I am working on Ubuntu 18.04. I have already tried replacing Solr (8.7.0 and 8.8.2).
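
For anyone who hits the same trace: a `NoSuchFieldError` means the class loaded at run time lacks a field that the calling code was compiled against, which usually points to a mismatched library version on the classpath. The thread never confirms the cause, so the following is only a diagnostic sketch. It assumes the missing constant lives in `org.apache.lucene.util.Version` (inferred from the field name, not stated in the thread); `LuceneVersionCheck` and `describe` are hypothetical names made up for illustration.

```java
import java.security.CodeSource;

public class LuceneVersionCheck {

    /**
     * Reports where a class was loaded from and whether it declares
     * the given public field. Uses plain reflection only.
     */
    static String describe(String className, String fieldName) {
        try {
            // Load without running static initialisers.
            Class<?> cls = Class.forName(
                    className, false, LuceneVersionCheck.class.getClassLoader());

            // The code source is null for classes from the JDK itself.
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            String origin = (src == null)
                    ? "(bootstrap or unknown)"
                    : String.valueOf(src.getLocation());

            boolean present;
            try {
                cls.getField(fieldName);
                present = true;
            } catch (NoSuchFieldException e) {
                present = false;
            }
            return className + " loaded from " + origin
                    + "; field " + fieldName + (present ? " present" : " MISSING");
        } catch (ClassNotFoundException e) {
            return className + " not found on the classpath";
        }
    }

    public static void main(String[] args) {
        // Assumption: the constant from the stack trace is declared here.
        System.out.println(describe("org.apache.lucene.util.Version", "LUCENE_8_8_2"));
    }
}
```

Run with the same classpath as the indexer, this prints which jar supplies the Lucene `Version` class and whether the constant exists there. If the field is reported as missing, an older Lucene jar is probably shadowing the one the OpenSearch client expects.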

Any ideas what I should change?

Thanks and BR, Stephan

aponb commented 1 year ago

Is it possible to get WAS-340616-20190514071541349-00000-~~.arc.gz to reproduce the exception?

steph-nb commented 1 year ago

Hi aponb,

I was no longer able to reproduce this error, sorry about that. I do not know what happened back then.

But while retesting I found a mistake in the Quick Start, which took me a while to spot:

The URL for the -s parameter is missing /solr:

java -jar target/warc-indexer-*-jar-with-dependencies.jar -s http://localhost:8983/discovery/ src/test/resources/wikipedia-mona-lisa/flashfrozen-jwat-recompressed.warc.gz

The correct command would be:

java -jar target/warc-indexer-*-jar-with-dependencies.jar -s http://localhost:8983/solr/discovery/ src/test/resources/wikipedia-mona-lisa/flashfrozen-jwat-recompressed.warc.gz
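
Since the missing `/solr` segment is easy to overlook, a small pre-flight check on the `-s` value can catch it before indexing starts. This is a minimal sketch, assuming the default Solr layout in which cores are served under the `/solr` context path; `SolrUrlCheck` and `looksLikeCoreUrl` are hypothetical names and not part of webarchive-discovery.

```java
import java.net.URI;

public class SolrUrlCheck {

    /**
     * Returns true if the URL path has the shape /solr/<core>[/...].
     * Assumes the default Solr context path "/solr".
     */
    static boolean looksLikeCoreUrl(String url) {
        String path = URI.create(url).getPath();
        if (path == null) {
            return false;
        }
        // "/solr/discovery/" splits into ["", "solr", "discovery"];
        // trailing empty strings are dropped by String.split.
        String[] parts = path.split("/");
        return parts.length >= 3
                && parts[1].equals("solr")
                && !parts[2].isEmpty();
    }

    public static void main(String[] args) {
        for (String url : args) {
            System.out.println(url + " -> "
                    + (looksLikeCoreUrl(url) ? "ok" : "missing /solr/<core> in path"));
        }
    }
}
```

The check only inspects the URL's shape; it does not contact the server, so a well-formed URL can still point at a core that does not exist.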

BR and thanks a lot, Stephan

anjackson commented 1 year ago

Thanks for that - I've updated the Quick Start: https://github.com/ukwa/webarchive-discovery/wiki/Quick-Start#indexing-a-warc-file