vitalyuf closed this 1 year ago
Hi there! I could not find any use of embeddings in ner_dstc2.json or in slotfill_dstc2.json. The intents config uses fasttext embeddings, but the [gobot config](https://github.com/deepmipt/DeepPavlov/blob/master/deeppavlov/configs/go_bot/gobot_dstc2.json) that references it does not.
If you do want to combine multiple configurations into one, you should replace each component description that has a config_path parameter with the pipeline elements from the referenced configuration file, properly connecting their inputs and outputs. Then, if you end up with multiple identical components, you can replace the later uses with refs as described in the documentation.
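A rough sketch of what the deduplication step could look like, assuming the `id` / `ref` / `#id` reference syntax from the configuration docs (the same mechanism as `"word_vocab": "#token_vocab"`). The id `shared_embedder` and the variable names `x_tokens`, `x_emb`, `other_tokens` and `other_emb` are placeholders made up for illustration, and the tokenization and model components are omitted, so this is a fragment and not a complete working config. Whether a particular component accepts a referenced embedder as a constructor parameter is also an assumption to verify:

```json
{
  "chainer": {
    "in": ["x"],
    "out": ["y_predicted"],
    "pipe": [
      {
        "id": "shared_embedder",
        "class_name": "fasttext",
        "load_path": "{DOWNLOADS_PATH}/embeddings/lenta_lower_100.bin",
        "in": ["x_tokens"],
        "out": ["x_emb"]
      },
      {
        "ref": "shared_embedder",
        "in": ["other_tokens"],
        "out": ["other_emb"]
      },
      {
        "class_name": "go_bot",
        "in": ["x"],
        "out": ["y_predicted"],
        "embedder": "#shared_embedder"
      }
    ]
  }
}
```

The first element defines the embedder once and gives it an id, the second reuses the same instance as a pipeline step with different inputs and outputs, and the third passes the same instance as a parameter, so the embedding file should only be loaded once.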
Sorry for the misinformation. I built the ner component based on ner_rus.json. It uses the fasttext embedder, and gobot_dstc2_best.json uses it too. And yes, the intents classifier uses it as well.
The example in the documentation shows a case where both elements are in the same pipeline. But the embedder in gobot_dstc2_best.json, for example, is not a pipeline element; it is a parameter of the go_bot element. And the slotfiller is a go_bot parameter too.
Let's consider an example:
{
  "in": ["x"],
  "in_y": ["y"],
  "out": ["y_predicted"],
  "main": true,
  "class_name": "go_bot",
  "load_path": "{MODELS_PATH}/my_gobot_rus/model",
  "save_path": "{MODELS_PATH}/my_gobot_rus/model",
  "debug": false,
  "word_vocab": "#token_vocab",
  "template_path": "{DOWNLOADS_PATH}/dstc2_v2_rus/dstc2-templates.txt",
  "template_type": "DualTemplate",
  "database": "#phone_database",
  "api_call_action": "api_call",
  "use_action_mask": false,
  "network_parameters": {
    "learning_rate": 0.002,
    "end_learning_rate": 0.00002,
    "decay_steps": 10,
    "decay_power": 0.5,
    "dropout_rate": 0.45,
    "l2_reg_coef": 2e-3,
    "hidden_size": 128,
    "dense_size": 64,
    "attention_mechanism": {
      "type": "cs_bahdanau",
      "hidden_size": 32,
      "depth": 3,
      "action_as_key": true,
      "max_num_tokens": 100,
      "projected_align": false
    }
  },
  **"slot_filler": {
    "config_path": "{CONFIGS_PATH}/ner/slotfill_dstc2_rus_colab.json"
  },**
  "intent_classifier": "{CONFIGS_PATH}/classifiers/intents_dstc2_big_rus.json",
  **"embedder": {
    "class_name": "fasttext",
    "load_path": "{DOWNLOADS_PATH}/embeddings/lenta_lower_100.bin"
  },**
  "bow_embedder": null,
  "tokenizer": {
    "class_name": "stream_spacy_tokenizer",
    "lowercase": false
  },
  "tracker": {
    "class_name": "featurized_tracker",
    "slot_names": ["surname", "name", "pos_confirm", "neg_confirm", "phone"]
  }
}
Have I understood correctly that, to include the slotfiller.json pipeline in gobot_dstc2_best.json, I should take the "pipe" component of the file {CONFIGS_PATH}/ner/slotfill_dstc2_rus_colab.json and copy-paste its content into config_path in place of "{CONFIGS_PATH}/ner/slotfill_dstc2_rus_colab.json"?
Hmmm. No, sorry, I don't think it will work like that. For now there is no way to use pipelines as a component constructor argument without using a config reference... I'll think about what we can do about it.
Hey! Splitting the GO-bot's units into separate pipeline units is planned, though it is not implemented yet.
Hi! I have a gobot based on the dstc2 gobot example. gobot.json references slotfiller.json, and slotfiller.json references ner.json. The problem is that the embedders are duplicated at runtime (one for ner and one for gobot), resulting in excessive memory usage. So I supposed it might be possible to build one big.json out of the three JSONs above. Is it possible? How can I do it?