RUCAIBox / RecBole

A unified, comprehensive and efficient recommendation library
https://recbole.io/
MIT License
3.37k stars 606 forks

[🐛BUG] ray.tune automatic hyperparameter tuning problem #1990

Open Gabrielle240125 opened 8 months ago

Gabrielle240125 commented 8 months ago

Describe the bug: an error is raised during ray.tune automatic hyperparameter tuning: KeyError: 'model need to be specified in at least one of the these ways: [model variable, config file, config dict, command line] '
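For readers unfamiliar with this message, here is a minimal sketch of the kind of lookup that produces it: several configuration sources are checked in turn, and a KeyError is raised when none of them supplies the setting. This is not RecBole's actual code. The function name `resolve_setting`, its parameters, and the lookup order are all assumptions made for illustration.

```python
def resolve_setting(name, explicit=None, cmd_config=None, dict_config=None, file_config=None):
    """Return the first value found for `name` across the given sources.

    Hypothetical helper: the source names mirror the four ways listed in the
    error message, but the priority order here is an assumption.
    """
    if explicit is not None:
        return explicit
    for source in (cmd_config, dict_config, file_config):
        if source and name in source:
            return source[name]
    raise KeyError(
        f"{name} need to be specified in at least one of the these ways: "
        "[model variable, config file, config dict, command line] "
    )


# Inside a tuning trial, the config dict may hold only the sampled hyperparameters.
trial_params = {"learning_rate": 1e-05, "epochs": 3}

# If the YAML file names the model, the lookup succeeds.
print(resolve_setting("model", dict_config=trial_params, file_config={"model": "BERT4Rec"}))
```

If no source contains `model`, the same call raises the KeyError quoted above.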

To reproduce. Steps to reproduce this bug:

  1. The extra yaml file you used

    dataset config

    field_separator: "\t"  # separator between fields in the dataset
    seq_separator: " "  # separator inside token_seq or float_seq fields
    USER_ID_FIELD: user_id  # user id field
    ITEM_ID_FIELD: item_id  # item id field
    RATING_FIELD: rating  # rating field (binary: purchased or not)
    TIME_FIELD: time  # time field
    NEGPREFIX: neg  # prefix for negative sampling
    LABEL_FIELD: type  # label field
    ITEM_LIST_LENGTH_FIELD: item_length  # sequence length field
    LIST_SUFFIX: _list  # suffix for sequence fields
    MAX_ITEM_LIST_LENGTH: 50  # maximum sequence length
    POSITION_FIELD: [V29, V11, V25]  # generated sequence position ids

    Specifies which columns to read from which file; here user_id, item_id, type, timestamp and flag are read from .inter, and likewise for the rest

    load_col:
      inter: [user_id, item_id, time, type]
      user: [user_id, V29, V11, V25]
    selected_features: [V29, V11, V25]

training settings

epochs: 6  # maximum number of training epochs
train_batch_size: 32  # training batch size
learner: adam  # built-in PyTorch optimizer to use
learning_rate: 0.001  # learning rate
training_neg_sample_args: ~  # number of negative samples
eval_step: 1  # number of evaluations run after each training round
stopping_step: 10  # number of steps used to judge convergence; training stops early if the chosen evaluation metric barely changes within this many steps

bert parameters

n_layers: 2  # (int) The number of transformer layers in transformer encoder.
n_heads: 4  # (int) The number of attention heads for multi-head attention layer.
hidden_size: 256  # (int) The number of features in the hidden state.
inner_size: 256  # (int) The inner hidden size in feed-forward layer.
hidden_dropout_prob: 0.5  # (float) The probability of an element to be zeroed.
attn_dropout_prob: 0.5  # (float) The probability of an attention score to be zeroed.
hidden_act: 'gelu'  # (str) The activation function in feed-forward layer.
layer_norm_eps: 1e-12  # (float) A value added to the denominator for numerical stability.
initializer_range: 0.02  # (float) The standard deviation for normal initialization.
mask_ratio: 0.2  # (float) The probability for an item replaced by MASK token.
loss_type: 'CE'  # (str) The type of loss function.
transform: mask_itemseq  # (str) The transform operation for batch data process.
ft_ratio: 0.5  # (float) The probability of generating fine-tuning samples

evaluation settings

eval_setting: TO_LS,full  # sort the data by time, split it with leave-one-out, and use full ranking
eval_args:
  split: {'LS': 'valid_and_test'}  # split ratio
  mode: full
  order: TO
metrics: ["Recall", "MRR","NDCG","Hit","Precision"]  # evaluation metrics
topk: [1,5,10]  # top-k used by the metrics; with 10 the metrics are ["Recall@10", "MRR@10", "NDCG@10", "Hit@10", "Precision@10"]
valid_metric: MRR@10  # metric used as the early-stopping criterion
eval_batch_size: 256  # evaluation batch size

  2. test file

    learning_rate choice [0.001,0.0001,0.00001]
    epochs choice [3,4,5,6,7,8]
    train_batch_size choice [16,32,64,128]
    hidden_dropout_prob choice [0.2,0.3,0.4,0.5,0.6,0.7,0.8]
    attn_dropout_prob choice [0.2,0.3,0.4,0.5,0.6,0.7,0.8]
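Each entry of the test file has the shape `name choice [values]`. As a rough illustration of how such entries map to a search space, the sketch below turns them into a dictionary of candidate values. `parse_choice_params` is a hypothetical helper written for this example, not RecBole's own parser, and it handles only the `choice` keyword that appears in this issue.

```python
import ast


def parse_choice_params(text):
    """Parse lines of the form `name choice [v1,v2,...]` into {name: [v1, v2, ...]}."""
    space = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        name, kind, values = line.split(maxsplit=2)
        if kind != "choice":
            raise ValueError(f"unsupported search type: {kind}")
        # literal_eval safely evaluates the bracketed list of numbers
        space[name] = ast.literal_eval(values)
    return space


params_text = """
learning_rate choice [0.001,0.0001,0.00001]
epochs choice [3,4,5,6,7,8]
train_batch_size choice [16,32,64,128]
"""
search_space = parse_choice_params(params_text)
print(sorted(search_space))
```

Note that the resulting dictionary holds only hyperparameter names; nothing in it names a model or dataset.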

  3. Your code

    python run_hyper.py --model=BERTRec --dataset=use --config_files=bert_test.yaml --params_file=bert_test.test --tool=Ray

  4. Your run script

2024-02-05 01:09:58,835 WARNING utils.py:575 -- Detecting docker specified CPUs. In previous versions of Ray, CPU detection in containers was incorrect. Please ensure that Ray has enough CPUs allocated. As a temporary workaround to revert to the prior behavior, set RAY_USE_MULTIPROCESSING_CPU_COUNT=1 as an env var before starting Ray. Set the env var: RAY_DISABLE_DOCKER_CPU_WARNING=1 to mute this warning.
2024-02-05 01:09:59,011 INFO worker.py:1724 -- Started a local Ray instance.
2024-02-05 01:09:59,567 INFO tune.py:592 -- [output] This will use the new output engine with verbosity 2. To disable the new output and use the legacy output engine, set the environment variable RAY_AIR_NEW_OUTPUT=0. For more information, please see https://github.com/ray-project/ray/issues/36949

Configuration for experiment    objective_function_2024-02-05_01-09-59
Search algorithm                BasicVariantGenerator
Scheduler                       AsyncHyperBandScheduler
Number of trials                5

View detailed results here: /root/autodl-tmp/zzzzzz/RecBole-1.2.0/ray_log/objective_function_2024-02-05_01-09-59
To visualize your results with TensorBoard, run: tensorboard --logdir /root/autodl-tmp/zzzzzz/RecBole-1.2.0/ray_log/objective_function_2024-02-05_01-09-59

Trial status: 5 PENDING
Current time: 2024-02-05 01:09:59. Total running time: 0s
Logical resource usage: 0/12 CPUs, 0/1 GPUs (0.0/1.0 accelerator_type:G)

Trial name                      status   learning_rate  epochs  train_batch_size  hidden_dropout_prob  attn_dropout_prob
objective_function_3a46e_00000  PENDING  1e-05          3       16                0.5                  0.2
objective_function_3a46e_00001  PENDING  1e-05          8       32                0.6                  0.3
objective_function_3a46e_00002  PENDING  1e-05          3       128               0.7                  0.3
objective_function_3a46e_00003  PENDING  0.001          6       32                0.8                  0.4
objective_function_3a46e_00004  PENDING  1e-05          5       16                0.8                  0.3

Trial objective_function_3a46e_00000 started with configuration:

Trial objective_function_3a46e_00000 config
attn_dropout_prob      0.2
epochs                 3
hidden_dropout_prob    0.5
learning_rate          1e-05
train_batch_size       16

2024-02-05 01:10:03,575 ERROR tune_controller.py:1374 -- Trial task failed for trial objective_function_3a46e_00000
Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.8/site-packages/ray/air/execution/_internal/event_manager.py", line 110, in resolve_future
    result = ray.get(future)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/_private/auto_init_hook.py", line 22, in auto_init_wrapper
    return fn(*args, **kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
    return func(*args, **kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/_private/worker.py", line 2624, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(KeyError): ray::ImplicitFunc.train() (pid=5180, ip=172.17.0.10, actor_id=59ef74a8e98fa23d2915878e01000000, repr=objective_function)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/tune/trainable/trainable.py", line 342, in train
    raise skipped from exception_cause(skipped)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/air/_internal/util.py", line 88, in run
    self._ret = self._target(*self._args, **self._kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/tune/trainable/function_trainable.py", line 115, in <lambda>
    training_func=lambda: self._trainable_func(self.config),
  File "/root/miniconda3/lib/python3.8/site-packages/ray/tune/trainable/function_trainable.py", line 332, in _trainable_func
    output = fn()
  File "/root/miniconda3/lib/python3.8/site-packages/ray/tune/trainable/util.py", line 138, in inner
    return trainable(config, **fn_kwargs)
  File "/root/autodl-tmp/zzzzzz/RecBole-1.2.0/recbole/quick_start/quick_start.py", line 205, in objective_function
    config = Config(config_dict=config_dict, config_file_list=config_file_list)
  File "/root/autodl-tmp/zzzzzz/RecBole-1.2.0/recbole/config/configurator.py", line 88, in __init__
    self.model, self.model_class, self.dataset = self._get_model_and_dataset(
  File "/root/autodl-tmp/zzzzzz/RecBole-1.2.0/recbole/config/configurator.py", line 207, in _get_model_and_dataset
    raise KeyError(
KeyError: 'model need to be specified in at least one of the these ways: [model variable, config file, config dict, command line] '

Trial objective_function_3a46e_00000 errored after 0 iterations at 2024-02-05 01:10:03. Total running time: 3s
Error file: /root/autodl-tmp/zzzzzz/RecBole-1.2.0/ray_log/objective_function_2024-02-05_01-09-59/objective_function_3a46e_00000_0_attn_dropout_prob=0.2000,epochs=3,hidden_dropout_prob=0.5000,learning_rate=0.0000,train_batch_siz_2024-02-05_01-09-59/error.txt

Trial objective_function_3a46e_00001 started with configuration:

Trial objective_function_3a46e_00001 config
attn_dropout_prob      0.3
epochs                 8
hidden_dropout_prob    0.6
learning_rate          1e-05
train_batch_size       32

2024-02-05 01:10:07,907 ERROR tune_controller.py:1374 -- Trial task failed for trial objective_function_3a46e_00001
Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.8/site-packages/ray/air/execution/_internal/event_manager.py", line 110, in resolve_future
    result = ray.get(future)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/_private/auto_init_hook.py", line 22, in auto_init_wrapper
    return fn(*args, **kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
    return func(*args, **kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/_private/worker.py", line 2624, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(KeyError): ray::ImplicitFunc.train() (pid=5288, ip=172.17.0.10, actor_id=93914fdfe83c79965f2215bd01000000, repr=objective_function)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/tune/trainable/trainable.py", line 342, in train
    raise skipped from exception_cause(skipped)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/air/_internal/util.py", line 88, in run
    self._ret = self._target(*self._args, **self._kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/tune/trainable/function_trainable.py", line 115, in <lambda>
    training_func=lambda: self._trainable_func(self.config),
  File "/root/miniconda3/lib/python3.8/site-packages/ray/tune/trainable/function_trainable.py", line 332, in _trainable_func
    output = fn()
  File "/root/miniconda3/lib/python3.8/site-packages/ray/tune/trainable/util.py", line 138, in inner
    return trainable(config, **fn_kwargs)
  File "/root/autodl-tmp/zzzzzz/RecBole-1.2.0/recbole/quick_start/quick_start.py", line 205, in objective_function
    config = Config(config_dict=config_dict, config_file_list=config_file_list)
  File "/root/autodl-tmp/zzzzzz/RecBole-1.2.0/recbole/config/configurator.py", line 88, in __init__
    self.model, self.model_class, self.dataset = self._get_model_and_dataset(
  File "/root/autodl-tmp/zzzzzz/RecBole-1.2.0/recbole/config/configurator.py", line 207, in _get_model_and_dataset
    raise KeyError(
KeyError: 'model need to be specified in at least one of the these ways: [model variable, config file, config dict, command line] '

Trial objective_function_3a46e_00001 errored after 0 iterations at 2024-02-05 01:10:07. Total running time: 8s
Error file: /root/autodl-tmp/zzzzzz/RecBole-1.2.0/ray_log/objective_function_2024-02-05_01-09-59/objective_function_3a46e_00001_1_attn_dropout_prob=0.3000,epochs=8,hidden_dropout_prob=0.6000,learning_rate=0.0000,train_batch_siz_2024-02-05_01-09-59/error.txt

Trial objective_function_3a46e_00002 started with configuration:

Trial objective_function_3a46e_00002 config
attn_dropout_prob      0.3
epochs                 3
hidden_dropout_prob    0.7
learning_rate          1e-05
train_batch_size       128

2024-02-05 01:10:11,884 ERROR tune_controller.py:1374 -- Trial task failed for trial objective_function_3a46e_00002
Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.8/site-packages/ray/air/execution/_internal/event_manager.py", line 110, in resolve_future
    result = ray.get(future)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/_private/auto_init_hook.py", line 22, in auto_init_wrapper
    return fn(*args, **kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
    return func(*args, **kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/_private/worker.py", line 2624, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(KeyError): ray::ImplicitFunc.train() (pid=5407, ip=172.17.0.10, actor_id=5ddebdbf0d35ad2b970d4e7a01000000, repr=objective_function)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/tune/trainable/trainable.py", line 342, in train
    raise skipped from exception_cause(skipped)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/air/_internal/util.py", line 88, in run
    self._ret = self._target(*self._args, **self._kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/tune/trainable/function_trainable.py", line 115, in <lambda>
    training_func=lambda: self._trainable_func(self.config),
  File "/root/miniconda3/lib/python3.8/site-packages/ray/tune/trainable/function_trainable.py", line 332, in _trainable_func
    output = fn()
  File "/root/miniconda3/lib/python3.8/site-packages/ray/tune/trainable/util.py", line 138, in inner
    return trainable(config, **fn_kwargs)
  File "/root/autodl-tmp/zzzzzz/RecBole-1.2.0/recbole/quick_start/quick_start.py", line 205, in objective_function
    config = Config(config_dict=config_dict, config_file_list=config_file_list)
  File "/root/autodl-tmp/zzzzzz/RecBole-1.2.0/recbole/config/configurator.py", line 88, in __init__
    self.model, self.model_class, self.dataset = self._get_model_and_dataset(
  File "/root/autodl-tmp/zzzzzz/RecBole-1.2.0/recbole/config/configurator.py", line 207, in _get_model_and_dataset
    raise KeyError(
KeyError: 'model need to be specified in at least one of the these ways: [model variable, config file, config dict, command line] '

Trial objective_function_3a46e_00002 errored after 0 iterations at 2024-02-05 01:10:11. Total running time: 12s
Error file: /root/autodl-tmp/zzzzzz/RecBole-1.2.0/ray_log/objective_function_2024-02-05_01-09-59/objective_function_3a46e_00002_2_attn_dropout_prob=0.3000,epochs=3,hidden_dropout_prob=0.7000,learning_rate=0.0000,train_batch_siz_2024-02-05_01-09-59/error.txt

Trial objective_function_3a46e_00003 started with configuration:

Trial objective_function_3a46e_00003 config
attn_dropout_prob      0.4
epochs                 6
hidden_dropout_prob    0.8
learning_rate          0.001
train_batch_size       32

2024-02-05 01:10:15,897 ERROR tune_controller.py:1374 -- Trial task failed for trial objective_function_3a46e_00003
Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.8/site-packages/ray/air/execution/_internal/event_manager.py", line 110, in resolve_future
    result = ray.get(future)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/_private/auto_init_hook.py", line 22, in auto_init_wrapper
    return fn(*args, **kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
    return func(*args, **kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/_private/worker.py", line 2624, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(KeyError): ray::ImplicitFunc.train() (pid=5511, ip=172.17.0.10, actor_id=87da750c3700a373057fafee01000000, repr=objective_function)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/tune/trainable/trainable.py", line 342, in train
    raise skipped from exception_cause(skipped)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/air/_internal/util.py", line 88, in run
    self._ret = self._target(*self._args, **self._kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/tune/trainable/function_trainable.py", line 115, in <lambda>
    training_func=lambda: self._trainable_func(self.config),
  File "/root/miniconda3/lib/python3.8/site-packages/ray/tune/trainable/function_trainable.py", line 332, in _trainable_func
    output = fn()
  File "/root/miniconda3/lib/python3.8/site-packages/ray/tune/trainable/util.py", line 138, in inner
    return trainable(config, **fn_kwargs)
  File "/root/autodl-tmp/zzzzzz/RecBole-1.2.0/recbole/quick_start/quick_start.py", line 205, in objective_function
    config = Config(config_dict=config_dict, config_file_list=config_file_list)
  File "/root/autodl-tmp/zzzzzz/RecBole-1.2.0/recbole/config/configurator.py", line 88, in __init__
    self.model, self.model_class, self.dataset = self._get_model_and_dataset(
  File "/root/autodl-tmp/zzzzzz/RecBole-1.2.0/recbole/config/configurator.py", line 207, in _get_model_and_dataset
    raise KeyError(
KeyError: 'model need to be specified in at least one of the these ways: [model variable, config file, config dict, command line] '

Trial objective_function_3a46e_00003 errored after 0 iterations at 2024-02-05 01:10:15. Total running time: 16s
Error file: /root/autodl-tmp/zzzzzz/RecBole-1.2.0/ray_log/objective_function_2024-02-05_01-09-59/objective_function_3a46e_00003_3_attn_dropout_prob=0.4000,epochs=6,hidden_dropout_prob=0.8000,learning_rate=0.0010,train_batch_siz_2024-02-05_01-09-59/error.txt

Trial objective_function_3a46e_00004 started with configuration:

Trial objective_function_3a46e_00004 config
attn_dropout_prob      0.3
epochs                 5
hidden_dropout_prob    0.8
learning_rate          1e-05
train_batch_size       16

2024-02-05 01:10:19,944 ERROR tune_controller.py:1374 -- Trial task failed for trial objective_function_3a46e_00004
Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.8/site-packages/ray/air/execution/_internal/event_manager.py", line 110, in resolve_future
    result = ray.get(future)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/_private/auto_init_hook.py", line 22, in auto_init_wrapper
    return fn(*args, **kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
    return func(*args, **kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/_private/worker.py", line 2624, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(KeyError): ray::ImplicitFunc.train() (pid=5615, ip=172.17.0.10, actor_id=6e5617af8c0396db6af5aa0101000000, repr=objective_function)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/tune/trainable/trainable.py", line 342, in train
    raise skipped from exception_cause(skipped)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/air/_internal/util.py", line 88, in run
    self._ret = self._target(*self._args, **self._kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/ray/tune/trainable/function_trainable.py", line 115, in <lambda>
    training_func=lambda: self._trainable_func(self.config),
  File "/root/miniconda3/lib/python3.8/site-packages/ray/tune/trainable/function_trainable.py", line 332, in _trainable_func
    output = fn()
  File "/root/miniconda3/lib/python3.8/site-packages/ray/tune/trainable/util.py", line 138, in inner
    return trainable(config, **fn_kwargs)
  File "/root/autodl-tmp/zzzzzz/RecBole-1.2.0/recbole/quick_start/quick_start.py", line 205, in objective_function
    config = Config(config_dict=config_dict, config_file_list=config_file_list)
  File "/root/autodl-tmp/zzzzzz/RecBole-1.2.0/recbole/config/configurator.py", line 88, in __init__
    self.model, self.model_class, self.dataset = self._get_model_and_dataset(
  File "/root/autodl-tmp/zzzzzz/RecBole-1.2.0/recbole/config/configurator.py", line 207, in _get_model_and_dataset
    raise KeyError(
KeyError: 'model need to be specified in at least one of the these ways: [model variable, config file, config dict, command line] '

Trial objective_function_3a46e_00004 errored after 0 iterations at 2024-02-05 01:10:19. Total running time: 20s
Error file: /root/autodl-tmp/zzzzzz/RecBole-1.2.0/ray_log/objective_function_2024-02-05_01-09-59/objective_function_3a46e_00004_4_attn_dropout_prob=0.3000,epochs=5,hidden_dropout_prob=0.8000,learning_rate=0.0000,train_batch_siz_2024-02-05_01-09-59/error.txt

Trial status: 5 ERROR
Current time: 2024-02-05 01:10:19. Total running time: 20s
Logical resource usage: 0/12 CPUs, 1.0/1 GPUs (0.0/1.0 accelerator_type:G)

Trial name                      status  learning_rate  epochs  train_batch_size  hidden_dropout_prob  attn_dropout_prob
objective_function_3a46e_00000  ERROR   1e-05          3       16                0.5                  0.2
objective_function_3a46e_00001  ERROR   1e-05          8       32                0.6                  0.3
objective_function_3a46e_00002  ERROR   1e-05          3       128               0.7                  0.3
objective_function_3a46e_00003  ERROR   0.001          6       32                0.8                  0.4
objective_function_3a46e_00004  ERROR   1e-05          5       16                0.8                  0.3

Number of errored trials: 5

Trial name                      # failures  error file
objective_function_3a46e_00000  1           /root/autodl-tmp/zzzzzz/RecBole-1.2.0/ray_log/objective_function_2024-02-05_01-09-59/objective_function_3a46e_00000_0_attn_dropout_prob=0.2000,epochs=3,hidden_dropout_prob=0.5000,learning_rate=0.0000,train_batch_siz_2024-02-05_01-09-59/error.txt
objective_function_3a46e_00001  1           /root/autodl-tmp/zzzzzz/RecBole-1.2.0/ray_log/objective_function_2024-02-05_01-09-59/objective_function_3a46e_00001_1_attn_dropout_prob=0.3000,epochs=8,hidden_dropout_prob=0.6000,learning_rate=0.0000,train_batch_siz_2024-02-05_01-09-59/error.txt
objective_function_3a46e_00002  1           /root/autodl-tmp/zzzzzz/RecBole-1.2.0/ray_log/objective_function_2024-02-05_01-09-59/objective_function_3a46e_00002_2_attn_dropout_prob=0.3000,epochs=3,hidden_dropout_prob=0.7000,learning_rate=0.0000,train_batch_siz_2024-02-05_01-09-59/error.txt
objective_function_3a46e_00003  1           /root/autodl-tmp/zzzzzz/RecBole-1.2.0/ray_log/objective_function_2024-02-05_01-09-59/objective_function_3a46e_00003_3_attn_dropout_prob=0.4000,epochs=6,hidden_dropout_prob=0.8000,learning_rate=0.0010,train_batch_siz_2024-02-05_01-09-59/error.txt
objective_function_3a46e_00004  1           /root/autodl-tmp/zzzzzz/RecBole-1.2.0/ray_log/objective_function_2024-02-05_01-09-59/objective_function_3a46e_00004_4_attn_dropout_prob=0.3000,epochs=5,hidden_dropout_prob=0.8000,learning_rate=0.0000,train_batch_siz_2024-02-05_01-09-59/error.txt

Traceback (most recent call last):
  File "run_hyper.py", line 127, in <module>
    ray_tune(args)
  File "run_hyper.py", line 94, in ray_tune
    result = tune.run(
  File "/root/miniconda3/lib/python3.8/site-packages/ray/tune/tune.py", line 1036, in run
    raise TuneError("Trials did not complete", incomplete_trials)
ray.tune.error.TuneError: ('Trials did not complete', [objective_function_3a46e_00000, objective_function_3a46e_00001, objective_function_3a46e_00002, objective_function_3a46e_00003, objective_function_3a46e_00004])

Expected behavior: How can this be resolved so that Ray runs successfully?

Screenshots: Add screenshots to help explain your problem. (Optional)

Links: Add a link to code that reproduces the bug, such as Colab or another online Jupyter platform. (Optional)

Environment (please complete the following information):

Gabrielle240125 commented 7 months ago

Parameters are only recognized by Ray's objective function when they are passed through config_files, so write the model and dataset into the .yaml file:

    model: BERT4Rec
    dataset: use
    tool: Ray
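One way to catch this before launching a tuning run is to check that the YAML file names both keys. The sketch below is a hypothetical helper (`missing_top_level_keys` is not part of RecBole or Ray). It assumes a flat config and uses plain string handling rather than a YAML parser, so it recognizes only simple top-level `key: value` lines.

```python
def missing_top_level_keys(yaml_text, required=("model", "dataset")):
    """Return the required keys that do not appear as top-level keys in yaml_text."""
    present = set()
    for raw in yaml_text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # drop trailing comments
        # skip blanks, nested (indented) entries, and lines without a key
        if not line or line[0].isspace() or ":" not in line:
            continue
        present.add(line.split(":", 1)[0].strip())
    return [key for key in required if key not in present]


original_yaml = "epochs: 6  # max epochs\nlearning_rate: 0.001\n"
fixed_yaml = "model: BERT4Rec\ndataset: use\n" + original_yaml

print(missing_top_level_keys(original_yaml))
print(missing_top_level_keys(fixed_yaml))
```

The first call reports both keys as missing, which matches the situation in this issue. The second returns an empty list once `model` and `dataset` are written into the file.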