XiaoLongtaoo closed this issue 3 months ago
It looks like a configuration error. Could you post your configuration file?
Here is my dataset configuration file:
tiny_seq:
  data_root: ../../data/
  data_format: npz
  train_data: ../../data/tiny_seq/train.npz
  valid_data: ../../data/tiny_seq/valid.npz
  test_data: ../../data/tiny_seq/test.npz
Model configuration file:
Base:
  model_root: './checkpoints/'
  num_workers: 3
  verbose: 1
  early_stop_patience: 2
  pickle_feature_encoder: True
  save_best_only: True
  eval_steps: null
  debug_mode: False
  group_id: null
  use_features: null
  feature_specs: null
  feature_config: null
TransAct_default: # This is a config template
  model: TransAct
  dataset_id: TBD
  loss: 'binary_crossentropy'
  metrics: ['logloss', 'AUC']
  task: binary_classification
  optimizer: adam
  learning_rate: 1.0e-3
  embedding_regularizer: 0
  net_regularizer: 0
  batch_size: 10000
  embedding_dim: 64
  hidden_activations: relu
  dcn_cross_layers: 3
  dcn_hidden_units: [1024, 512, 256]
  mlp_hidden_units: []
  num_heads: 1
  transformer_layers: 1
  transformer_dropout: 0
  dim_feedforward: 512
  net_dropout: 0
  target_item_field: adgroup_id
  sequence_item_field: click_sequence
  first_k_cols: 1
  use_time_window_mask: False
  time_window_ms: 86400000
  concat_max_pool: True
  batch_norm: False
  epochs: 100
  shuffle: True
  seed: 20242025
  monitor: {'AUC': 1, 'logloss': -1}
  monitor_mode: 'max'
TransAct_test:
  model: TransAct
  dataset_id: tiny_seq
  loss: 'binary_crossentropy'
  metrics: ['logloss', 'AUC']
  task: binary_classification
  optimizer: adam
  learning_rate: 1.0e-3
  embedding_regularizer: 0
  net_regularizer: 0
  batch_size: 128
  embedding_dim: 4
  hidden_activations: relu
  dcn_cross_layers: 3
  dcn_hidden_units: [64, 32]
  mlp_hidden_units: []
  num_heads: 1
  transformer_layers: 1
  transformer_dropout: 0
  dim_feedforward: 512
  net_dropout: 0
  target_item_field: adgroup_id
  sequence_item_field: click_sequence
  first_k_cols: 1
  use_time_window_mask: False
  time_window_ms: 86400000
  concat_max_pool: True
  batch_norm: False
  epochs: 1
  shuffle: False
  seed: 20242025
  monitor: {'AUC': 1, 'logloss': -1}
  monitor_mode: 'max'
Did the problem occur when you ran the test config directly?
Thanks for reporting the issue. It occurs because the whole sequence is masked, which happens when a sample has no behavior data, as in this demo data. You can fix it by modifying https://github.com/reczoo/FuxiCTR/blob/main/model_zoo/TransAct/src/TransAct.py#L223 as follows:
key_padding_mask = self.adjust_mask(mask) # ensure that not all tokens are masked
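To illustrate why a fully masked sequence produces NaN, here is a minimal sketch in plain Python (so it runs without PyTorch). It assumes the usual convention that masked keys get a score of -inf before the softmax. The `adjust_mask` below is a hypothetical stand-in named after the call above, not the repository's implementation, and unmasking the last position is an assumption made only for this example.

```python
import math

NEG_INF = float("-inf")


def masked_softmax(scores, padding_mask):
    """Softmax over one row of attention scores.

    padding_mask[i] is True when key i is padding and must be ignored.
    """
    # Masked keys get -inf so they would receive zero attention weight.
    masked = [NEG_INF if m else s for s, m in zip(scores, padding_mask)]
    peak = max(masked)
    # If every key is masked, peak is -inf and (-inf) - (-inf) is NaN,
    # so the whole row becomes NaN.
    exps = [math.exp(x - peak) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]


def adjust_mask(padding_mask):
    """Hypothetical guard: keep one position visible in a fully masked row.

    Assumption: the last position is the one that gets unmasked.
    """
    adjusted = list(padding_mask)
    if adjusted and all(adjusted):
        adjusted[-1] = False
    return adjusted


scores = [0.5, 1.0, -0.2]
all_padded = [True, True, True]

print(masked_softmax(scores, all_padded))               # every weight is NaN
print(masked_softmax(scores, adjust_mask(all_padded)))  # finite weights
```

Once a single NaN reaches the loss, the backward pass can spread it to the parameters, which would be consistent with the NaN embedding weights reported in this issue.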
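The problematic code is located here.
Nothing went wrong during training, but during evaluation the validation data all became NaN after passing through this embedding_layer. I verified that all of the embedding_layer weights are NaN. What went wrong here? (The validation data was loaded correctly, and X looks normal for the validation set.)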
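A report like this can be narrowed down by checking which parameters already contain NaN before evaluation starts. The helper below is a minimal sketch in plain Python. `find_nan_params` is a hypothetical name, and it assumes the parameters have been flattened into lists of floats, for example in PyTorch with `{name: p.detach().flatten().tolist() for name, p in model.named_parameters()}`.

```python
import math


def find_nan_params(named_values):
    """Return (name, nan_count, total) for every parameter that contains NaN.

    named_values maps a parameter name to a flat list of floats.
    """
    report = []
    for name, values in named_values.items():
        nan_count = sum(1 for v in values if math.isnan(v))
        if nan_count:
            report.append((name, nan_count, len(values)))
    return report


# Illustrative values only; the parameter names are made up for this example.
example = {
    "embedding_layer.weight": [float("nan"), float("nan")],
    "dnn.bias": [0.1, 0.2],
}
for name, bad, total in find_nan_params(example):
    print(f"{name}: {bad}/{total} values are NaN")
```

Running a check like this after each training step or epoch shows when the weights first turn NaN, which helps separate a training-time cause from an evaluation-time one.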