Closed. cessda-bitbucket-importer closed this issue 4 years ago.
Original comment by John Shepherdson (GitHub: john-shepherdson).
Need to do more work to understand how to use Kibana
Original comment by John Shepherdson (GitHub: john-shepherdson).
Check for presence of required fields, consistency of field values etc. Report issues back to SPs (via Metadata Office).
Original comment by John Shepherdson (GitHub: john-shepherdson).
Any obvious deficiencies are being reported to the Metadata Office via the issue tracker (https://github.com/cessda/cessda.metadata.officeissues)
Original comment by John Shepherdson (GitHub: john-shepherdson).
Better logging. Source, field, problem. Compatibility with Graylog. Use CDC DDI profile as pre-check?
Original comment by John Shepherdson (GitHub: john-shepherdson).
Improve logging output Estimate of effort required to diagnose and fix: 1 day (CONTRACTOR)
Original comment by Moses Mansaray (GitHub: doraVentures).
@john-shepherdson your first comment on this ticket is:
Need to do more work to understand how to use Kibana
However, I see you have now mentioned Graylog, a log aggregator/dashboard similar to Kibana. Is it correct to say Graylog supersedes Kibana? Note I have not used Graylog before, so I would need to do some research on it to meet the compatibility you mentioned here. However, I would expect any log aggregator/service to handle most logs, and to provide custom tools on its platform for interpreting logs it cannot parse automatically, rather than requiring knowledge of Graylog inside one's application (vendor lock-in!).
Better logging. Source, field, problem. Compatibility with Graylog. Use CDC DDI profile as pre-check?
Check for presence of required fields, consistence of field values etc. Report issues back to SPs (via Metadata Office).
I’m interpreting the above two comments as:
You want me to log every study document that is being passed by the Indexer (this would include all fields)
Run the document against a custom JSON Schema, or custom logic, to “check for presence of required fields” and log any errors found
“Consistency of field values” is the job of a log aggregator/dashboard and cannot be done by the Indexer
Note: this would only be done in DEBUG mode.
Please confirm my understanding of this ticket above is correct.
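Point 2 above could be sketched roughly as follows. This is a minimal illustration only: the field names in REQUIRED are hypothetical placeholders, not the actual fields mandated by the CDC DDI profile, and `missingFields` is not an existing method in the codebase.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Minimal sketch of a required-field check on a harvested study document.
// Field names here are assumptions; the real list would come from the CDC DDI profile.
public class RequiredFieldCheck {

    static final List<String> REQUIRED = List.of("titleStudy", "abstract", "studyNumber");

    // Returns the names of required fields that are absent or blank,
    // so they can be logged and reported back to the Service Provider.
    static List<String> missingFields(Map<String, Object> studyDocument) {
        List<String> missing = new ArrayList<>();
        for (String field : REQUIRED) {
            Object value = studyDocument.get(field);
            if (value == null || value.toString().isBlank()) {
                missing.add(field);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // A document with only a title is missing the other required fields.
        System.out.println(missingFields(Map.of("titleStudy", "Example Study")));
    }
}
```

In practice a proper JSON Schema validator would replace the hand-rolled loop, but the output is the same idea: a per-document list of deficiencies suitable for logging.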
Original comment by Moses Mansaray (GitHub: doraVentures).
@john-shepherdson Awaiting your response. Also, if we are to use the proposed external study validator service we discussed, this work would be redundant.
Assigning to you.
Original comment by John Shepherdson (GitHub: john-shepherdson).
Graylog is now being used instead of Kibana, but to some extent that is irrelevant here.
Agree that there are potential performance issues, so we may need to make this modal.
When I look at Spring Boot logs and see error messages, it can be difficult to know which endpoint and/or which record is causing the error. Some examples from https://datacatalogue-dev.cessda.eu/admin/#/applications/e3783374/logfile which may not be covered by the external validator:
2019-12-23 16:19:22.792 WARN (DefaultHarvesterConsumerService.java:87) - Exception msg[Unsuccessful response from remote repository.]. External system response body[{"message":"InternalSystemException: Unable to parse xml :Error on line 19033: Attribute name \"w\" associated with an element type \"location\" must be followed by the ' = ' character."}]
2019-12-23 16:22:31.746 ERROR (LogHelper.java:49) - RemoteResponse(logLevel=ERROR, responseCode=406, responseMessage=Not Acceptable, occurredAt=2019-12-23T16:22:31.746242)
2019-12-23 16:41:36.346 WARN (DefaultHarvesterConsumerService.java:87) - Exception msg[Unsuccessful response from remote repository.]. External system response body[{"message":"InternalSystemException: Unable to parse xml :Error on line 3213: The processing instruction target matching \"[xX][mM][lL]\" is not allowed."}]
Original comment by Moses Mansaray (GitHub: doraVentures).
Agree. I will
Original comment by Moses Mansaray (GitHub: doraVentures).
Improved logs; see PRs here. @john-shepherdson
Original comment by Moses Mansaray (GitHub: doraVentures).
Assigning to you, @john-shepherdson. Please review the logs on a full re-index and feed back if you need more or less verbose log information.
Original comment by John Shepherdson (GitHub: john-shepherdson).
First-cut inspection of an incremental reharvest shows more info re the source of errors (endpoint: SP and type).
I need to check that record numbers are also present when parsing errors occur.
Original comment by Moses Mansaray (GitHub: doraVentures).
Agree. By the way, are you running the apps locally, or do you have future deployments working? The PRs here are still to be merged to the master branch, though I have updated some of the logging alongside other tickets. The changes in these PRs significantly improve the detail of the Service Provider endpoint URLs and study identifiers involved in errors.
Original comment by Moses Mansaray (GitHub: doraVentures).
@john-shepherdson
I’ve reworked the tests to increase coverage around logging activities and merged all branches to master for faster feedback. Feel free to re-run a full re-ingestion.
Original comment by John Shepherdson (GitHub: john-shepherdson).
Please include endpoint details in following messages:
pasc-osmh-handler-oai-pmh:
2019-12-30 11:56:34.017 ERROR (getDocument) (ListRecordHeadersServiceImpl.java:193) - Unable to parse repo RecordHeader response bytes.
pasc-osmh-handler-nesstar:
2019-12-30 12:04:00.844 ERROR (ListRecordHeadersServiceImpl.java:105) - Unable to parse repo RecordHeader response bytes.
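The fix that followed added the full ListRecords URL (the fullListRecordUrlPath mentioned later in this thread) into that message. A minimal sketch of the enriched message, with the exact format assumed rather than taken from the merged PRs:

```java
// Sketch: include the full ListRecords URL in the parse-failure message so
// the failing endpoint can be identified directly from the log line.
public class ListRecordHeadersLogFormat {

    static String parseFailureMessage(String fullListRecordUrlPath) {
        return "Unable to parse repo RecordHeader response bytes for [" + fullListRecordUrlPath + "].";
    }

    public static void main(String[] args) {
        // Example OAI-PMH ListRecords URL; the real value is built per repository.
        System.out.println(parseFailureMessage(
                "https://oai.example.org/oai?verb=ListRecords&metadataPrefix=ddi"));
    }
}
```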
Original comment by Moses Mansaray (GitHub: doraVentures).
Done.
PRs that add fullListRecordUrlPath logging, plus the modifications needed to allow it.
@john-shepherdson I’ll be merging these next so I know the full state of play in Sonar before ending the day and this phase of iteration fixes and improvements.
Original comment by Moses Mansaray (GitHub: doraVentures).
Please include endpoint details in following messages:
Fix verified and working on DEV
handler-oai-pmh
handler Nesstar
To be helpful, here is the actual response page when I try to manually access that URL.
Original report on BitBucket by John Shepherdson (GitHub: john-shepherdson).
Check for presence of required fields, consistency of field values etc. Report issues back to SPs (via Metadata Office).