deeppavlov / DeepPavlov

An open source library for deep learning end-to-end dialog systems and chatbots.
https://deeppavlov.ai
Apache License 2.0

building complete gobot.json without separate slotfiller.json and ner.json #664

Closed vitalyuf closed 1 year ago

vitalyuf commented 5 years ago

Hi! I have a gobot based on the dstc2 gobot example. gobot.json references slotfiller.json, and slotfiller.json references ner.json. The embedder gets duplicated at runtime (one instance for ner and one for gobot), which results in excessive memory usage. So I supposed it might be possible to build one big JSON config out of the three files above. Is it possible? How can I do it?

yoptar commented 5 years ago

Hi there! I could not find any embeddings usage in ner_dstc2.json or in slotfill_dstc2.json. The intents config uses fasttext embeddings, but the [gobot config](https://github.com/deepmipt/DeepPavlov/blob/master/deeppavlov/configs/go_bot/gobot_dstc2.json) that references it does not.

If you do want to combine multiple configurations into one, you should replace each component description that has a config_path parameter with the pipeline elements from the referenced configuration file, properly connecting inputs and outputs. Then, if you have multiple identical components, you can replace later uses with refs as described in the documentation.
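As a rough illustration of the "replace later uses with refs" step, a `pipe` fragment might look like the sketch below. The `id` value `shared_embedder` and the variable names `x_tokens`, `y_tokens`, `x_emb` and `y_emb` are placeholders invented for this example; the embedder class and path are the ones mentioned later in this thread. The second element reuses the already constructed embedder instead of loading the vectors a second time.

```json
{
  "pipe": [
    {
      "id": "shared_embedder",
      "class_name": "fasttext",
      "load_path": "{DOWNLOADS_PATH}/embeddings/lenta_lower_100.bin",
      "in": ["x_tokens"],
      "out": ["x_emb"]
    },
    {
      "ref": "shared_embedder",
      "in": ["y_tokens"],
      "out": ["y_emb"]
    }
  ]
}
```

This only covers components that live in the same config file; the tokenizer steps that would produce `x_tokens` and `y_tokens` are omitted here.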

vitalyuf commented 5 years ago

Sorry for the misinformation. I built the ner component based on ner_rus.json. It uses the fasttext embedder, and gobot_dstc2_best.json uses it too. And yes, the intents classifier uses it as well.

The example in the documentation shows a case where both elements are in the same pipeline. But the embedder in gobot_dstc2_best.json, for example, is not a pipeline element; it is a parameter of the go_bot element. The slotfiller is a go_bot parameter too.

Let's consider an example:

      {
        "in": ["x"],
        "in_y": ["y"],
        "out": ["y_predicted"],
        "main": true,
        "class_name": "go_bot",
        "load_path": "{MODELS_PATH}/my_gobot_rus/model",
        "save_path": "{MODELS_PATH}/my_gobot_rus/model",
        "debug": false,
        "word_vocab": "#token_vocab",
        "template_path": "{DOWNLOADS_PATH}/dstc2_v2_rus/dstc2-templates.txt",
        "template_type": "DualTemplate",
        "database": "#phone_database",
        "api_call_action": "api_call",
        "use_action_mask": false,
        "network_parameters": {
          "learning_rate": 0.002,
          "end_learning_rate": 0.00002,
          "decay_steps": 10,
          "decay_power": 0.5,
          "dropout_rate": 0.45,
          "l2_reg_coef": 2e-3,
          "hidden_size": 128,
          "dense_size": 64,
          "attention_mechanism": {
            "type": "cs_bahdanau",
            "hidden_size": 32,
            "depth": 3,
            "action_as_key": true,
            "max_num_tokens": 100,
            "projected_align": false
          }
        },
        "slot_filler": {
          "config_path": "{CONFIGS_PATH}/ner/slotfill_dstc2_rus_colab.json"
        },
        "intent_classifier": "{CONFIGS_PATH}/classifiers/intents_dstc2_big_rus.json",
        "embedder": {
          "class_name": "fasttext",
          "load_path": "{DOWNLOADS_PATH}/embeddings/lenta_lower_100.bin"
        },
        "bow_embedder": null,
        "tokenizer": {
          "class_name": "stream_spacy_tokenizer",
          "lowercase": false
        },
        "tracker": {
          "class_name": "featurized_tracker",
          "slot_names": ["surname", "name", "pos_confirm", "neg_confirm", "phone"]
        }
      }

Have I understood correctly that, to include the slotfiller.json pipeline in gobot_dstc2_best.json, I should take the "pipe" component of the file {CONFIGS_PATH}/ner/slotfill_dstc2_rus_colab.json and copy-paste its content into config_path in place of "{CONFIGS_PATH}/ner/slotfill_dstc2_rus_colab.json"?

yoptar commented 5 years ago

Hmmm. No, sorry, I don't think it will work like that. For now there is no way to use pipelines as a component constructor argument without using a config reference... I'll think about what we can do about it.
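A single component, unlike a whole pipeline, can presumably still be shared through the `#id` string form that the config above already uses for `word_vocab` and `database`. The sketch below assumes that go_bot accepts an embedder passed this way just as it accepts the inline definition; `shared_embedder` is a made-up id, and all other go_bot parameters are omitted for brevity. It has not been tested against a real model.

```json
{
  "pipe": [
    {
      "id": "shared_embedder",
      "class_name": "fasttext",
      "load_path": "{DOWNLOADS_PATH}/embeddings/lenta_lower_100.bin"
    },
    {
      "in": ["x"],
      "in_y": ["y"],
      "out": ["y_predicted"],
      "main": true,
      "class_name": "go_bot",
      "embedder": "#shared_embedder"
    }
  ]
}
```

Note that this does not remove the second embedder instance created inside the config referenced by `slot_filler`, since that config is still loaded separately through `config_path`.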

oserikov commented 4 years ago

Hey! Splitting the GO-bot units into separate pipeline components is planned but not implemented yet.