hxpGit512 closed this issue 1 year ago
If this requirement is mandatory, does it mean that the `var_names` of the reference and query datasets have to be aligned before every prediction on a new dataset?
Thank you for your interest in TOSICA. In version 1.0, the reference and the query must have identical `var_names` so that the query fits the masked embedding built during the training step. However, this does not mean that you need to realign your reference every time you get a new dataset. Instead, you can use the `var_names` of your reference to select and order the columns of the query. If certain `var_names` are not present in the query, you can simply treat them as dropouts and fill them with 0 as input. A dropout rate of less than 20% is acceptable; otherwise, realigning and retraining become necessary.
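The select, order, and zero-fill step described above can be sketched roughly as follows. This is a minimal illustration and not part of TOSICA's API: the function name `align_to_reference`, its parameters, and the toy gene names are hypothetical, it handles dense matrices only, and it assumes the "dropout rate" means the fraction of reference `var_names` that are missing from the query.

```python
import numpy as np


def align_to_reference(query_X, query_genes, ref_genes, max_missing=0.2):
    """Reorder query columns to follow ref_genes, zero-filling absent genes.

    Hypothetical helper, not a TOSICA function. Assumes a dense
    (cells x genes) matrix and that the 20% limit applies to the
    fraction of reference genes missing from the query.
    """
    query_X = np.asarray(query_X)
    # Map each query gene name to its column index.
    col_of = {gene: i for i, gene in enumerate(query_genes)}

    missing = [gene for gene in ref_genes if gene not in col_of]
    missing_rate = len(missing) / len(ref_genes)
    if missing_rate >= max_missing:
        raise ValueError(
            f"{missing_rate:.1%} of reference genes are missing from the query; "
            "realign the datasets and retrain instead of zero-filling."
        )

    # Start from zeros so genes absent from the query stay at 0 (treated as dropout).
    aligned = np.zeros((query_X.shape[0], len(ref_genes)), dtype=query_X.dtype)
    for j, gene in enumerate(ref_genes):
        i = col_of.get(gene)
        if i is not None:
            aligned[:, j] = query_X[:, i]
    # Genes present only in the query (e.g. "EXTRA" below) are dropped.
    return aligned, missing


# Toy data: the query lacks G4 and carries one gene the reference never saw.
ref_genes = ["G1", "G2", "G3", "G4", "G5", "G6"]
query_genes = ["G3", "G1", "EXTRA", "G2", "G6", "G5"]
query_X = np.array(
    [[3, 1, 9, 2, 6, 5],
     [30, 10, 90, 20, 60, 50]],
    dtype=float,
)

aligned, missing = align_to_reference(query_X, query_genes, ref_genes)
print(missing)
print(aligned)
```

With an AnnData object, the inputs would presumably come from `adata.X` (converted to dense if it is sparse) and `adata.var_names`, and the aligned matrix would be wrapped in a new AnnData whose `var_names` are those of the reference before prediction.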
Received.