VirtualFlyBrain / vfb-pipeline-dumps

Pipeline that creates dumps from the triplestore for consumption by the downstream services
Apache License 2.0

Add timer logging to dumps #32 #33

Closed hkir-dev closed 2 years ago

hkir-dev commented 2 years ago

Fixes issue #32 by logging the start and end times of each rule to a log file. This could be improved to log the execution duration instead. That would be a small change (making every rule a single command), but I think it would reduce readability.

dosumis commented 2 years ago

Can we just output the stage (as there is no indication of which step) and the total time taken?

$@ is the target name, so this will record which step is being timed (I think that's what you're asking).

hkir-dev commented 2 years ago

Expected log will be something like:

remove_embargoed_data started: 1651061473
remove_embargoed_data ended: 1651061478
/out/dumps/owlery.owl started: 1651061478
/out/dumps/owlery.owl ended: 1651061479
...
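A log in this shape could come from a pair of echo lines around each recipe, roughly as sketched below. This is only an illustration, not the code from the PR: LOG_FILE and its path are hypothetical, and the middle line is a placeholder for the real recipe commands.

```make
# Hypothetical log location; not taken from the PR.
LOG_FILE ?= /tmp/dumps_timing.log

# $@ is expanded by make to the target name; $$ passes a literal $ to the
# shell, so $$(date +%s) becomes the shell command substitution $(date +%s).
# Each recipe line runs in its own shell, which is fine here because no
# shell variable has to survive from one line to the next.
remove_embargoed_data:
	@echo "$@ started: $$(date +%s)" >> $(LOG_FILE)
	@echo "placeholder for the real recipe commands"
	@echo "$@ ended: $$(date +%s)" >> $(LOG_FILE)
```

Note that if a command in the recipe fails, make stops there, so the "ended" line is never written for that target.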

Instead of logging the start and end times, we could calculate and log the difference. However, in a Makefile each line of a recipe runs in a separate shell instance, so a shell variable set on one line is gone by the next. To keep the start time available until the end, I would need to convert every rule into a single shell command joined with backslashes (\). Like:

# $@ (single $) is expanded by make to the target name; $$ passes a literal $ to the shell.
remove_embargoed_data: $(SPARQL_DIR)/delete_*.sparql
	d=$$(date +%s); \
	$(foreach f,$^,curl -X POST -H "Content-Type:application/x-www-form-urlencoded" -d "update=`cat $(f)`" $(SPARQL_ENDPOINT)/statements &&) \
	echo "$@ took $$(($$(date +%s)-d)) seconds"

But I think it will be a little ugly and hard to maintain (especially in multi-line commands).
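One way to avoid the backslash joins, not raised in this thread, is GNU Make's .ONESHELL special target (GNU Make 3.82 or later), which hands all lines of a recipe to a single shell so that variables persist between lines. A minimal sketch, assuming GNU Make and using sleep as a placeholder for the real commands:

```make
# Requires GNU Make >= 3.82. Affects every recipe in this Makefile.
.ONESHELL:
# With one shell per recipe, a failing middle line no longer aborts the
# recipe by default; -e restores that behaviour.
.SHELLFLAGS := -ec

remove_embargoed_data:
	@d=$$(date +%s)
	sleep 1  # placeholder for the real recipe commands
	echo "$@ took $$(( $$(date +%s) - d )) seconds"
```

With .ONESHELL, only the first line of a recipe is checked for the @, - and + prefixes, so the leading @ here silences the whole recipe. Because the setting is global, existing recipes that rely on per-line shells (for example a cd that should not leak into the next line) would need to be reviewed before adopting it.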

dosumis commented 2 years ago

Expected log will be something like:

remove_embargoed_data started: 1651061473
remove_embargoed_data ended: 1651061478
/out/dumps/owlery.owl started: 1651061478
/out/dumps/owlery.owl ended: 1651061479
...

I think that's fine and any improvements should be on a new PR.

I guess it is harder to do this elegantly in Makefiles than in a shell script. One possibility might be to use a make var set at the start and compare all timings back to the start time (with some nicer formatting of time).
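The make-variable idea could be sketched as follows. START is a hypothetical name, and the recipe body is a placeholder. Since make substitutes $(START) into the recipe text before the shell runs, nothing has to survive between recipe lines and no backslash joining is needed.

```make
# Captured once, when make reads the Makefile (":=" expands immediately).
START := $(shell date +%s)

remove_embargoed_data:
	@echo "placeholder for the real recipe commands"
	@e=$$(( $$(date +%s) - $(START) )); printf '%s finished at +%02d:%02d\n' "$@" $$((e / 60)) $$((e % 60))
```

This reports the elapsed time since the start of the run as minutes and seconds, rather than the duration of each step. Per-step durations can still be recovered by subtracting consecutive entries.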