google-cloudsearch / norconex-committer-plugin

Google Cloud Search Norconex HTTP Collector Indexer Plugin
Apache License 2.0

Show "Unable to upload default ACL" randomly #8

Closed FcrbPeter closed 5 years ago

FcrbPeter commented 5 years ago

Hi,

I am using GKE to run the Norconex crawler with this plugin. There are about 6 crawler jobs, with MongoDB as the datastore. The jobs are shown below.

gcs_user@cloudshell:~/deployment (ai-gcsimpl-uat-235207)$ kubectl get pods
NAME                          READY   STATUS      RESTARTS   AGE
crawler-job-acer-tnsx8        1/1     Running     0          12h
crawler-job-acerpro-q77bk     1/1     Running     0          12h
crawler-job-community-hbzp7   1/1     Running     0          12h
crawler-job-custhelp-jgn4s    0/1     Completed   0          12h
crawler-job-datasheet-hd684   1/1     Running     0          12h
crawler-job-ec-wxdb5          1/1     Running     0          12h
mongo-0                       2/2     Running     0          10h

My problem is that the "Unable to upload default ACL" error shows up randomly in all of the crawlers. Below is a recent error log. I have double-checked the configurations and they are correct. I can provide them if needed.

ERROR [JobSuite] Execution failed for job: webcrawler-datasheet
com.google.enterprise.cloudsearch.sdk.StartupException: Unable to upload default ACL.
        at com.google.enterprise.cloudsearch.sdk.indexing.DefaultAcl.<init>(DefaultAcl.java:220)
        at com.google.enterprise.cloudsearch.sdk.indexing.DefaultAcl.<init>(DefaultAcl.java:93)
        at com.google.enterprise.cloudsearch.sdk.indexing.DefaultAcl$Builder.build(DefaultAcl.java:456)
        at com.google.enterprise.cloudsearch.sdk.indexing.DefaultAcl.fromConfiguration(DefaultAcl.java:266)
        at com.norconex.committer.googlecloudsearch.GoogleCloudSearchCommitter$Helper.initDefaultAclFromConfig(GoogleCloudSearchCommitter.java:378)
        at com.norconex.committer.googlecloudsearch.GoogleCloudSearchCommitter.init(GoogleCloudSearchCommitter.java:170)
        at com.norconex.committer.googlecloudsearch.GoogleCloudSearchCommitter.commitBatch(GoogleCloudSearchCommitter.java:203)
        at com.norconex.committer.core.AbstractBatchCommitter.commitAndCleanBatch(AbstractBatchCommitter.java:179)
        at com.norconex.committer.core.AbstractBatchCommitter.cacheOperationAndCommitIfReady(AbstractBatchCommitter.java:208)
        at com.norconex.committer.core.AbstractBatchCommitter.commitAddition(AbstractBatchCommitter.java:143)
        at com.norconex.committer.core.AbstractFileQueueCommitter.commit(AbstractFileQueueCommitter.java:222)
        at com.norconex.collector.core.crawler.AbstractCrawler.execute(AbstractCrawler.java:274)
        at com.norconex.collector.core.crawler.AbstractCrawler.doExecute(AbstractCrawler.java:228)
        at com.norconex.collector.core.crawler.AbstractCrawler.startExecution(AbstractCrawler.java:184)
        at com.norconex.jef4.job.AbstractResumableJob.execute(AbstractResumableJob.java:49)
        at com.norconex.jef4.suite.JobSuite.runJob(JobSuite.java:355)
        at com.norconex.jef4.suite.JobSuite.doExecute(JobSuite.java:296)
        at com.norconex.jef4.suite.JobSuite.execute(JobSuite.java:168)
        at com.norconex.collector.core.AbstractCollector.start(AbstractCollector.java:131)
        at com.norconex.collector.core.AbstractCollectorLauncher.launch(AbstractCollectorLauncher.java:95)
        at com.norconex.collector.http.HttpCollector.main(HttpCollector.java:74)
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException
        at com.google.enterprise.cloudsearch.sdk.AsyncRequest$SettableFutureCallback.onFailure(AsyncRequest.java:134)
        at com.google.api.client.googleapis.batch.json.JsonBatchCallback.onFailure(JsonBatchCallback.java:54)
        at com.google.api.client.googleapis.batch.json.JsonBatchCallback.onFailure(JsonBatchCallback.java:50)
        at com.google.api.client.googleapis.batch.BatchUnparsedResponse.parseAndCallback(BatchUnparsedResponse.java:223)
        at com.google.api.client.googleapis.batch.BatchUnparsedResponse.parseNextResponse(BatchUnparsedResponse.java:155)
        at com.google.api.client.googleapis.batch.BatchRequest.execute(BatchRequest.java:253)
        at com.google.enterprise.cloudsearch.sdk.BatchRequestService$BatchRequestHelper.executeBatchRequest(BatchRequestService.java:427)
        at com.google.enterprise.cloudsearch.sdk.BatchRequestService$SnapshotRunnable.execute(BatchRequestService.java:297)
        at com.google.enterprise.cloudsearch.sdk.BatchRequestService$SnapshotRunnable.run(BatchRequestService.java:227)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
INFO  [JobSuite] Running webcrawler-datasheet: END (Wed Apr 17 14:29:52 UTC 2019)

FcrbPeter commented 5 years ago

I solved it by setting a different "defaultAcl.name" for each crawler.
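
For anyone hitting the same error, here is a minimal sketch of what that fix might look like in each crawler's Cloud Search SDK properties file. The file names and the name values are hypothetical; only the "defaultAcl.name" key comes from this thread. Presumably the crawlers were all uploading a default ACL item under the same name at the same time, though that cause is not confirmed here.

```properties
# crawler-datasheet.properties (hypothetical file name)
# Give this crawler its own default ACL item name (value is illustrative).
defaultAcl.name=DEFAULT_ACL_datasheet

# crawler-community.properties (hypothetical file name)
# A different name, so this crawler does not upload the same default ACL item.
defaultAcl.name=DEFAULT_ACL_community
```

Every other setting stays as it was; the only change is that no two crawlers share the same name.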