Esri / geoportal-server-harvester

Metadata Harvester for Esri Geoportal Server
http://esri.github.io/geoportal-server/
Apache License 2.0

Files not harvested to the geoportal #119

Closed Durga07 closed 3 years ago

Durga07 commented 3 years ago

I tried harvesting files from ArcGIS Online to Geoportal 2.6.3 through Harvester 2.6.3. The home section says the task is 'completed', but the file is not harvested to the Geoportal. (I'm not using the machine on which Geoportal is installed; I'm using my personal computer.) This is the exported definition of the task:

    {
      "name": "",
      "source": {
        "type": "AGP-IN",
        "label": "ArcgisonlineD",
        "properties": {
          "agp-host-url": "https://arcg.is/0GqLzP",
          "agp-folder-id": "Trail Geoportal",
          "cred-username": "Durga7",
          "cred-password": "zzzzz",
          "agp-emit-xml": "true",
          "agp-emit-xml-fmt": "DEFAULT",
          "agp-emit-json": "false",
          "agp-max-redirects": "5"
        },
        "keywords": [],
        "ref": "4bb5f339-0910-438f-9b19-6a7601ea1a4f"
      },
      "destinations": [
        {
          "action": {
            "type": "GPT",
            "label": "X",
            "properties": {
              "gpt-host-url": "http://49.207.9.178:8080/geoportal",
              "cred-username": "publisher",
              "cred-password": "zzzzz",
              "gpt-index": "",
              "gpt-cleanup": "true",
              "gpt-accept-xml": "false",
              "gpt-accept-json": "false",
              "gpt-translate-pdf": "false"
            },
            "keywords": [],
            "ref": "5cee0cbd-07cb-43fb-8f93-4ec404b26204"
          }
        }
      ],
      "keywords": [],
      "incremental": false,
      "ignoreRobotsTxt": false,
      "ref": "bc376995-ada1-4ac7-9bfe-3be9232c97d6"
    }

mhogeweg commented 3 years ago

Hi, the agp-host-url points to a web map. You need to point to the root ArcGIS Online URL instead, as indicated in the sample below the input box for the URL.

Durga07 commented 3 years ago

I tried it this way, but the file is still not harvested. Is this correct? [Screenshot: Harvester1]

mhogeweg commented 3 years ago

The URL should be https (we'll update the example in the next release). Is your account Durga7 an ArcGIS Online user account (and not the geoportal account)?

Durga07 commented 3 years ago

Yes, Durga7 is an ArcGIS Online user account.

mhogeweg commented 3 years ago

Ah, one other thing. The Folder field is a reference to the folder in ArcGIS Online. To get the proper folder id (not the name), open your ArcGIS Online organization and, in your Content tab, select the folder in the list of folders on the left under 'My Content'. You will then see the folder id in the URL as the folder parameter.
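The lookup described above can be sketched in Python. This is only an illustration: the helper names and the example URL are hypothetical, and the assumption that a folder id is a 32-character hexadecimal string is based solely on the id that appears later in this thread.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: folder ids look like the 32-hex-character id seen in this thread.
FOLDER_ID = re.compile(r"^[0-9a-f]{32}$")


def folder_id_from_url(url: str) -> Optional[str]:
    """Return the value of the 'folder' parameter, if the URL carries one."""
    parts = urlparse(url)
    # The parameter may sit in the query string or after a '#' fragment,
    # depending on how the page builds its URLs, so try each place.
    candidates = (
        parts.query,
        parts.fragment.partition("?")[2],
        parts.fragment,
    )
    for chunk in candidates:
        values = parse_qs(chunk).get("folder")
        if values:
            return values[0]
    return None


def looks_like_folder_id(value: str) -> bool:
    """True when the value has the shape of an id rather than a folder name."""
    return bool(FOLDER_ID.match(value.lower()))


if __name__ == "__main__":
    # Hypothetical URL copied from the browser's address bar.
    url = "https://example.maps.arcgis.com/home/content.html?folder=b63d2f2d73644d00a9aa302dff4c913a"
    print(folder_id_from_url(url))
    print(looks_like_folder_id("Trail Geoportal"))
```

A check like `looks_like_folder_id` would flag a folder name such as "Trail Geoportal" before the task is run.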

Durga07 commented 3 years ago

Is this it? [Screenshot (267)]

Still, the file is not harvested.

mhogeweg commented 3 years ago

Yes, that would be the ID.

Durga07 commented 3 years ago

Hi, this is how it turned out. The files are still not harvested to the Geoportal. [Screenshot: Harvester2]

mhogeweg commented 3 years ago

When you select the link of the id on the right, what do you see? Also, can you check the harvester AND geoportal logs? Some of the validation happens in the geoportal and not in the harvester.

Durga07 commented 3 years ago

Hi, the id link doesn't take me anywhere.

Log file of Harvester: hrv.2020-08-04.log

com.esri.geoportal.harvester.api.ex.DataOutputException: Error publishing data: id: 809f1de4ad424eadb86cb76ab63a8d36, modified: Mon Aug 03 10:47:24 IST 2020, source URI: 809f1de4ad424eadb86cb76ab63a8d36, broker URI: AGP:https://www.arcgis.com
    at com.esri.geoportal.harvester.gpt.GptBroker.publish(GptBroker.java:214)
    at com.esri.geoportal.harvester.api.base.BrokerLinkActionAdaptor.push(BrokerLinkActionAdaptor.java:64)
    at com.esri.geoportal.harvester.api.base.SimpleLink.push(SimpleLink.java:71)
    at com.esri.geoportal.harvester.engine.defaults.DefaultProcessor$DefaultProcess.lambda$null$0(DefaultProcessor.java:158)
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(Unknown Source)
    at java.util.stream.ReferencePipeline$Head.forEach(Unknown Source)
    at com.esri.geoportal.harvester.engine.defaults.DefaultProcessor$DefaultProcess.lambda$new$1(DefaultProcessor.java:156)
    at java.lang.Thread.run(Unknown Source)
Caused by: javax.net.ssl.SSLException: Unrecognized SSL message, plaintext connection?
    at sun.security.ssl.InputRecord.handleUnknownRecord(Unknown Source)
    at sun.security.ssl.InputRecord.read(Unknown Source)
    at sun.security.ssl.SSLSocketImpl.readRecord(Unknown Source)
    at sun.security.ssl.SSLSocketImpl.performInitialHandshake(Unknown Source)
    at sun.security.ssl.SSLSocketImpl.startHandshake(Unknown Source)
    at sun.security.ssl.SSLSocketImpl.startHandshake(Unknown Source)
    at org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:436)
    at org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:384)
    at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
    at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:374)
    at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
    at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
    at com.esri.geoportal.commons.gpt.client.Client.execute(Client.java:700)
    at com.esri.geoportal.commons.gpt.client.Client.generateToken(Client.java:754)
    at com.esri.geoportal.commons.gpt.client.Client.getAccessToken(Client.java:725)
    at com.esri.geoportal.commons.gpt.client.Client.queryIds(Client.java:544)
    at com.esri.geoportal.commons.gpt.client.Client.publish(Client.java:256)
    at com.esri.geoportal.harvester.gpt.GptBroker.publish(GptBroker.java:199)
    ... 7 more
05-Aug-2020 14:11:16.747 INFO [HARVESTING] com.esri.geoportal.harvester.support.ReportLogger.completed Completed processing task: PROCESS:: status: completed, title: NAME: 04-08, PROCESSOR: DEFAULT[], SOURCE: AGP-IN[agp-host-url=https://www.arcgis.com, agp-folder-id=b63d2f2d73644d00a9aa302dff4c913a, cred-username=Durga7, cred-password=, agp-emit-xml=true, agp-emit-xml-fmt=DEFAULT, agp-emit-json=true, agp-max-redirects=5], DESTINATIONS: [GPT[gpt-host-url=https://49.207.9.178:8080/geoportal, cred-username=publisher, cred-password=, gpt-index=, gpt-cleanup=false, gpt-accept-xml=true, gpt-accept-json=true, gpt-translate-pdf=true]], INCREMENTAL: false, IGNOREROBOTSTXT: true
05-Aug-2020 14:11:16.747 INFO [HARVESTING] com.esri.geoportal.harvester.support.ReportStatistics.completed Harvesting of PROCESS:: status: completed, title: NAME: 04-08, PROCESSOR: DEFAULT[], SOURCE: AGP-IN[agp-host-url=https://www.arcgis.com, agp-folder-id=b63d2f2d73644d00a9aa302dff4c913a, cred-username=Durga7, cred-password=, agp-emit-xml=true, agp-emit-xml-fmt=DEFAULT, agp-emit-json=true, agp-max-redirects=5], DESTINATIONS: [GPT[gpt-host-url=https://49.207.9.178:8080/geoportal, cred-username=publisher, cred-password=, gpt-index=, gpt-cleanup=false, gpt-accept-xml=true, gpt-accept-json=true, gpt-translate-pdf=true]], INCREMENTAL: false, IGNOREROBOTSTXT: true completed at Wed Aug 05 14:11:16 IST 2020. No. succeded: 0, no. failed: 1
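Note that the log reports "status: completed" even though the counts at the end show zero successes and one failure. As a rough sketch, the counts can be pulled from the ReportStatistics line with a regular expression. The function and class names here are made up for illustration, and the pattern is modeled only on the single log line quoted in this thread (including its "succeded" spelling), so other harvester versions may format it differently.

```python
import re
from typing import NamedTuple, Optional

# Pattern modeled on the summary quoted above; "succee?ded" accepts both the
# spelling used in this log ("succeded") and the corrected one.
SUMMARY = re.compile(r"No\. succee?ded: (\d+), no\. failed: (\d+)")


class HarvestCounts(NamedTuple):
    succeeded: int
    failed: int


def parse_counts(log_text: str) -> Optional[HarvestCounts]:
    """Return the counts from the last summary found in the log text."""
    matches = SUMMARY.findall(log_text)
    if not matches:
        return None
    succeeded, failed = matches[-1]
    return HarvestCounts(int(succeeded), int(failed))


if __name__ == "__main__":
    tail = "completed at Wed Aug 05 14:11:16 IST 2020. No. succeded: 0, no. failed: 1"
    counts = parse_counts(tail)
    if counts is not None and counts.failed > 0:
        print(f"task finished, but {counts.failed} record(s) failed to publish")
```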

mhogeweg commented 3 years ago

Your geoportal URL is not correct: you combined port 8080 with the https protocol. Use http://49.207.9.178:8080/geoportal instead.
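The "Unrecognized SSL message, plaintext connection?" error in the log is what a TLS client typically reports when the server on that port answers in plain HTTP. A small pre-flight check along these lines could catch the mismatch before a task is run. This is a heuristic sketch, not part of the harvester: the function name is hypothetical, and the port sets reflect common defaults only (for example, Tomcat's 8080 for HTTP and 8443 for HTTPS); a server can be configured differently.

```python
from typing import Optional
from urllib.parse import urlparse

# Assumption: conventional defaults only. Any server may be set up otherwise.
PLAIN_HTTP_PORTS = {80, 8080}
TLS_PORTS = {443, 8443}


def scheme_port_warning(url: str) -> Optional[str]:
    """Return a warning when the scheme and port look mismatched, else None."""
    parts = urlparse(url)
    port = parts.port  # None when the URL has no explicit port
    if parts.scheme == "https" and port in PLAIN_HTTP_PORTS:
        return f"https on port {port}: this port usually serves plain http"
    if parts.scheme == "http" and port in TLS_PORTS:
        return f"http on port {port}: this port usually serves https"
    return None


if __name__ == "__main__":
    for candidate in (
        "https://49.207.9.178:8080/geoportal",
        "http://49.207.9.178:8080/geoportal",
    ):
        print(candidate, "->", scheme_port_warning(candidate))
```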

Durga07 commented 3 years ago

Yes, the files are now harvested to Geoportal. But why can't I add this item in the map viewer? [Screenshot: Capture6 (Add AO)]

The XML file : http://49.207.9.178:8080/geoportal/rest/metadata/item/02ff69fc5afa473c8e50e7a72184cc2e/xml

mhogeweg commented 3 years ago

That is because the actual resource referenced in the XML is the portal map viewer application. Geoportal can currently only add web services to a map. It looks like your map included the dark grey canvas basemap with the transportation world map service on top of it. If you register that map service separately in your portal and then harvest it into geoportal, you should be able to add it to the map.
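To illustrate the distinction, the sketch below tells an ArcGIS REST service endpoint apart from an application URL by its path. It is only a heuristic for checking your own records: the function name and both example URLs are hypothetical, and it is not how Geoportal itself decides what can be added to the map.

```python
import re
from urllib.parse import urlparse

# ArcGIS REST service endpoints end in a service type, optionally followed
# by a layer index, e.g. .../MapServer or .../FeatureServer/0
SERVICE_SUFFIX = re.compile(
    r"/(MapServer|FeatureServer|ImageServer)(/\d+)?/?$", re.IGNORECASE
)


def looks_like_web_service(url: str) -> bool:
    """True when the URL path ends like an ArcGIS REST service endpoint."""
    return bool(SERVICE_SUFFIX.search(urlparse(url).path))


if __name__ == "__main__":
    # Hypothetical examples: a map viewer application link and a map service.
    viewer = "https://www.arcgis.com/home/webmap/viewer.html?webmap=0123456789abcdef0123456789abcdef"
    service = "https://example.com/arcgis/rest/services/Transportation/MapServer"
    for url in (viewer, service):
        kind = "web service" if looks_like_web_service(url) else "not a service endpoint"
        print(url, "->", kind)
```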