HarikalarKutusu / cv-tbox-dataset-compiler

GNU Affero General Public License v3.0

[BUG] Clean cached data when doing forced calculation #40

Open HarikalarKutusu opened 1 month ago

HarikalarKutusu commented 1 month ago

This happened in TC, but it can occur anywhere. We cache the calculated data and summaries to avoid re-calculation: on later runs we skip the cached items and append only the newly processed ones. But when we do a forced re-calculation (e.g. when new measures are added), these caches are not cleaned first, so all data in them can become duplicated.

A sensible solution would be to delete the summary files (e.g. $tc_stats.tsv) beforehand. But we need to take care with DEBUG mode and/or future per-language/version re-calculations...
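
A minimal sketch of the pre-clean step could look like the following. The function name `clean_cached_summaries` and its `force` / `debug` parameters are hypothetical and not taken from the project's code. It also assumes that a DEBUG run should leave the shared summary files untouched and only report what would be removed; that is one possible reading of the DEBUG concern, not a settled decision.

```python
from pathlib import Path
from typing import Iterable, List


def clean_cached_summaries(
    summary_files: Iterable[Path],
    force: bool,
    debug: bool = False,
) -> List[Path]:
    """Remove cached summary files before a forced re-calculation.

    Returns the files that were removed (or, in debug mode, the files
    that would have been removed). Hypothetical helper for illustration.
    """
    # Normal (incremental) runs keep the cache and only append new results.
    if not force:
        return []

    removed: List[Path] = []
    for path in summary_files:
        if path.is_file():
            # Assumption: debug runs must not delete shared summaries,
            # so they only report the candidates.
            if not debug:
                path.unlink()
            removed.append(path)
    return removed
```

This deletes whole files, so it does not cover per-language/version re-calculations. For those, dropping only the affected rows from the summary would probably be needed instead of deleting the file.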