baidu-research / tripmaster

Apache License 2.0

Cannot use tripmaster TMSuperviseLearner to go through complete learning pipeline #3

Open YvonneYang1234 opened 1 year ago

YvonneYang1234 commented 1 year ago

I cannot get tripmaster's TMSuperviseLearner to run through the complete learning pipeline. After the "Task" module runs, the program ends without any error or warning. When I use the Pangu package instead, the pipeline runs to completion. The log is as follows:

[2023-03-10 02:00:39] DEBUG: Logging queue listener started!
[2023-03-10 02:01:10] INFO: 1 samples loaded

My Application is a subclass of "TMStandaloneApp", my LearningSystem is a subclass of "TMSystem", and my Learner is a subclass of "TMSuperviseLearner". "TMSuperviseLearner" is imported from tripmaster.core.components.operator.supervise.

My config YAML is as follows:

config:
  io:
    input:
      task:
        train_sample_ratio_for_eval: 0
        serialize:
          save: false
          path: ${job.startup_path}/doc_hoia_task_data.pkl
          load: false

  train_sample_ratio_for_eval:  0
    save: false
    path: ${job.startup_path}/doc_hoia_problem_data.pkl
    load: false

launcher:
  type: local
  strategies:
    local:

job: ray_tune: false

startup_path: "" testing: false test: validate: False sample_num: 10 epoch_num: 10 batching: type: fixed_size strategies: fixed_size: batch_size: 1 drop_last: False

parallel: single

dataloader: worker_num: 0 # load data using multi-process pin_memory: false timeout: 0 resource_allocation_range: 10000 drop_last: False train_eval_sampling_ratio: 0 resource: computing: cpu_per_trial: 1 cpus: 4 gpu_per_trial: 0 gpus: 0 memory: inferencing_memory_limit: 1000 learning_memory_limit: 1000 distributed: "no" metric_logging: type: tableprint strategies: tableprint: { } tensorboard: path: "metrics"

system:
  serialize:
    save: true
    path: ${job.startup_path}/doc_hoia.system.pkl
    load: false

task:
  evaluator: {}  # define raw evaluator?
  tp_modeler:
    require_words: True
    provide_words: True
    add_bert_tokens: True
    spacy_language_pack: "en_core_web_sm"
    pretrained_tokenizer_path: "ernie-3.0-base-zh"  # "ernie-3.0-mini-zh"

problem: evaluator: machine: arch: pretrained: model_path: "ernie-3.0-base-zh" #"ernie-3.0-mini-zh" voc_size: null decoder: all_copy: true anno_hidden_size: 768 arc_hidden_size: 128 beam_size: 1 cross_attn: false dropout: 0 input_size: 768 rel_hidden_size: 768 edge_embedding_dims: 128 label2id_path: ${job.startup_path}/label2id.yaml loss: interpolation: 0.5 alpha: 1.0 beta: 1.0 lamb: 1.0 evaluator: average: "weighted" num_edge_types: 67 learner: optimizer: strategy: epochs: 1 algorithm: pretrained_embedding: lr: 5e-5 decoder: lr: 1e-4

    gamma: 0.9
  gradient_clip_val: 1.
  stage: "problem"
  channel: "dev"
  metric: "span_split_prediction.Acc"
  # metric: "span_type_prediction.Accuracy"
  better: "max"
  save_prefix: "best"
  interval: 1

repo:
  server: ""
  local_dir: ${job.startup_path}/pangu

rudaoshi commented 1 year ago

I think your System class should subclass TMSuperviseSystem rather than TMSystem.
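
One way to confirm which base a system class derives from before re-running the job is to inspect its ancestry. The sketch below is not tripmaster code. `TMSystem` and `TMSuperviseSystem` are local stand-in classes so the snippet runs on its own, the assumption that `TMSuperviseSystem` itself derives from `TMSystem` is mine, and `has_base_named`, `LearningSystemBefore`, and `LearningSystemAfter` are hypothetical names used only for illustration. In a real project the stand-ins would be replaced by the classes imported from tripmaster.

```python
# Stand-in base classes so the sketch runs without tripmaster installed.
# ASSUMPTION: the real TMSuperviseSystem derives from TMSystem; this
# relationship is not confirmed by the thread.
class TMSystem:
    pass


class TMSuperviseSystem(TMSystem):
    pass


def has_base_named(cls, base_name):
    """Return True if any ancestor of cls (excluding cls itself) is named base_name."""
    return any(base.__name__ == base_name for base in cls.__mro__[1:])


# Hypothetical user classes: the first mirrors the setup described in the
# question, the second applies the suggested change of base class.
class LearningSystemBefore(TMSystem):
    pass


class LearningSystemAfter(TMSuperviseSystem):
    pass


if __name__ == "__main__":
    for system_cls in (LearningSystemBefore, LearningSystemAfter):
        ok = has_base_named(system_cls, "TMSuperviseSystem")
        print(f"{system_cls.__name__}: derives from TMSuperviseSystem = {ok}")
```

The check compares class names rather than class objects, so it can be pointed at a real system class without knowing the exact module that `TMSuperviseSystem` lives in.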