We've had problems in the past when taking tree sequences that have already been inferred and trying to infer them again: if the metadata schema name already exists, we bomb out here:
https://github.com/tskit-dev/tsinfer/blob/1d45c0c8122d0680cee2fcfd4b23c7dbbcb7a497/tsinfer/inference.py#L118
I suggest that we add a description to each metadata schema field that we define (e.g. for "sample_data_id", "sample_data_time", etc.). If a field of the same name already exists in the metadata schema and its description matches ours, we overwrite it with a warning instead of failing. The justification is that a matching description (which would presumably contain the word "tsinfer") means we are simply stomping on our own data.
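A minimal sketch of the check I have in mind, written against plain JSON-schema dicts rather than the actual tsinfer code. The helper name `add_tsinfer_property` and the description text are made up for illustration, and raising `ValueError` on a mismatch is an assumption about how we would want to keep failing:

```python
import copy
import warnings


def add_tsinfer_property(schema, name, definition):
    """
    Return a copy of ``schema`` with ``definition`` stored under ``name``.

    Hypothetical helper, not part of tsinfer. If ``name`` already exists with
    the same description, warn and overwrite it; if the description differs
    (or is missing), raise, since the field may belong to someone else.
    """
    schema = copy.deepcopy(schema)  # leave the caller's schema untouched
    properties = schema.setdefault("properties", {})
    existing = properties.get(name)
    if existing is not None:
        if existing.get("description") != definition["description"]:
            raise ValueError(
                f"Metadata schema already has a different '{name}' property"
            )
        warnings.warn(
            f"Overwriting existing tsinfer metadata property '{name}'",
            stacklevel=2,
        )
    properties[name] = definition
    return schema


# Example field definition; the description wording is an assumption.
sample_data_time = {
    "type": "number",
    "description": "The time of this sample in the tsinfer sample data file",
}
schema = {"codec": "json", "type": "object", "properties": {}}
schema = add_tsinfer_property(schema, "sample_data_time", sample_data_time)
```

Calling the helper a second time with the same definition warns and overwrites, which is the re-inference case above. A field with the same name but a different (or no) description still raises, so we never silently clobber metadata written by another tool.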