Closed: nichtich closed this 8 months ago
It is testable. It is enough to run on a smaller dataset (e.g. 100K records). It takes minutes, not 2 days.
The visible timestamp in the client helps to tell whether/when the analysis has finished. The current setup with 73 million PICA records took 2 to 2.5 hours each for completeness, classifications, and authorities.
Sorry, I do not understand this comment. The client = the web UI? In the web UI there is a visible timestamp. What is missing is explicit status information, such as `started at <date time>` or `finished at <date time>`. We can also add the duration of the process, such as `finished at <date time> took <duration>`.
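To make the proposed status text concrete, here is a minimal sketch of how such a line could be built from a start and an optional end time. The function name `format_status` and the date format are assumptions for illustration only, not part of qa-catalogue.

```python
from datetime import datetime
from typing import Optional


def format_status(started: datetime, finished: Optional[datetime] = None) -> str:
    """Build a status line (hypothetical helper, not qa-catalogue code)."""
    fmt = "%Y-%m-%d %H:%M:%S"  # assumed display format
    if finished is None:
        # The analysis is still running: only the start time is known.
        return f"started at {started.strftime(fmt)}"
    # The analysis is done: report the end time and the elapsed time as h:mm:ss.
    total = int((finished - started).total_seconds())
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    return (
        f"finished at {finished.strftime(fmt)} "
        f"took {hours}:{minutes:02d}:{seconds:02d}"
    )
```

Calling it with only a start time gives the "started at" form; passing both times gives the "finished at ... took ..." form.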
When I mentioned that it takes minutes (if you run the tool on 100K records), I meant everything, including indexing. Completeness etc. take some hours, but the indexing takes much longer.
Sorry for the confusion. This has been done with https://github.com/pkiraly/qa-catalogue/commit/6812aed4382d6ea87b5fc9abb6ce656328f0c296
It looks like analysis parameter files such as `completeness.params.json` are written at the start of analysis (via `processor.beforeIteration`). This results in the start time being shown via `analysisTimestamp` in the client when the end time of analysis should be shown instead.
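The ordering problem described above can be illustrated with a small sketch in which the parameter file is written only after the analysis has run, so that `analysisTimestamp` holds the end time. This is not the qa-catalogue implementation (which is Java). The function `run_analysis`, the extra keys `started` and `duration`, and the JSON layout are assumptions made for the example.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable


def run_analysis(output_dir: Path, name: str, analyse: Callable[[], None]) -> dict:
    """Run `analyse`, then write <name>.params.json (hypothetical sketch)."""
    started = datetime.now(timezone.utc)
    analyse()  # the long-running step; nothing is written before it completes
    finished = datetime.now(timezone.utc)

    params = {
        # End time of the analysis, which is what the client should display.
        "analysisTimestamp": finished.isoformat(timespec="seconds"),
        # Assumed extra fields, not present in the original description.
        "started": started.isoformat(timespec="seconds"),
        "duration": round((finished - started).total_seconds(), 3),
    }
    path = output_dir / f"{name}.params.json"
    path.write_text(json.dumps(params, indent=2), encoding="utf-8")
    return params
```

Writing the file after the run also means its presence can serve as a rough "finished" signal, though a run that crashes midway would leave no file at all.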