explosion / spaCy

💫 Industrial-strength Natural Language Processing (NLP) in Python
https://spacy.io
MIT License

Guide to component_cfg parameters. #5513

Closed dixitishan811 closed 4 years ago

dixitishan811 commented 4 years ago

My question is: is there any comprehensive guide to what the parameters of component_cfg signify? I understand the common parameters such as beam_width and the other CNN parameters, but I don't know what, for example, 'nr_feature_tokens' means.

Also, does the training model use a CNN followed by a BiLSTM layer, and how will performance be affected if I increase the bilstm_depth parameter?

One more question: can I use a custom optimizer, for example RAdam, for training a new spaCy model?

It would be really helpful if someone could clarify these questions about the architecture of spaCy's training model.

component_cfg = {
    "ner": {
        "beam_width": 1,
        "beam_density": 0.0,
        "beam_update_prob": 1.0,
        "cnn_maxout_pieces": 3,
        "nr_feature_tokens": 6,
        "nr_class": 10,
        "hidden_depth": 1,
        "token_vector_width": 96,
        "hidden_width": 64,
        "maxout_pieces": 2,
        "pretrained_vectors": None,
        "bilstm_depth": 0,
        "self_attn_depth": 0,
        "conv_depth": 4,
        "conv_window": 3,
        "embed_size": 2000,
    }
}

svlandeg commented 4 years ago

Is there any comprehensive guide to what the parameters of component_cfg signify?

Unfortunately, I don't think we have currently documented all these internal parameters extensively anywhere, no. To add insult to injury, it is currently not always clear what the exact values of these parameters are for a given model, as there are multiple places in the code where we're defining default values or overwriting existing ones.
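
For what it's worth, runtime decoding settings such as beam_width can be overridden per call through the component_cfg argument. Here is a minimal sketch of that pattern in v2; the text is just a placeholder and the exact set of keys each component accepts per call depends on your spaCy version, so please verify against your install:

import spacy

nlp = spacy.load("en_core_web_sm")

# Per-call decoding settings for the NER component. Architecture
# settings such as conv_depth or token_vector_width are fixed when
# the model is first constructed, so overriding them on an
# already-trained pipeline generally has no effect.
cfg = {"ner": {"beam_width": 16, "beam_density": 0.0001}}

doc = nlp("Ada Lovelace was born in London.", component_cfg=cfg)
print([(ent.text, ent.label_) for ent in doc.ents])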

These are issues we are currently fixing for the release of spaCy v.3. A config file will be used that holds all these parameter values and gives the user much more control over them. Additionally, we will make certain to document them properly for this release. Currently, v.3 is in development on the develop branch.

Does the training model use a CNN followed by a BiLSTM layer, and how will performance be affected if I increase the bilstm_depth parameter?

Whether you need additional BiLSTM layers or not really depends on the size of your dataset and the complexity of the modeling task. This setting is used to define the Tok2Vec layer of the parser (see here), and the BiLSTM layers are indeed chained after the CNN layers. You'll need PyTorch installed to make this work.
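
As a rough sketch of how you could experiment with that in v2 (whether bilstm_depth is picked up through component_cfg this way depends on your exact spaCy version, so treat the snippet as an assumption to verify):

import spacy

nlp = spacy.blank("en")
ner = nlp.create_pipe("ner")
nlp.add_pipe(ner)
ner.add_label("GADGET")

# Model-construction settings are read when the component's model is
# first built, i.e. at begin_training() time. bilstm_depth=2 would
# stack two BiLSTM layers on top of the CNN encoder (requires PyTorch).
optimizer = nlp.begin_training(
    component_cfg={"ner": {"bilstm_depth": 2, "conv_depth": 4}}
)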

Can I use a custom optimizer, for example RAdam, for training a new spaCy model?

This is another feature that will be so much easier to implement in v.3: you'll just be able to define the preferred optimizer in your config file. In v.2 you also have control over this though: nlp.update() can be called with an sgd object you create. If it is None, a default one is created and stored internally. But instead you could for instance pass RAdam as defined by Thinc here.
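
A minimal v2 sketch of that pattern follows; the import paths and constructor arguments below are from Thinc 7.x and may differ in your installed version, so double-check them (and swap in Thinc's RAdam if your version provides it):

import spacy
from thinc.neural.optimizers import Adam  # replace with RAdam if available in your Thinc
from thinc.neural.ops import NumpyOps

nlp = spacy.blank("en")
ner = nlp.create_pipe("ner")
nlp.add_pipe(ner)
ner.add_label("GADGET")
nlp.begin_training()

# Create the optimizer yourself and pass it to every update() call;
# if sgd is None, spaCy creates and stores a default one instead.
optimizer = Adam(NumpyOps(), 0.001)

TRAIN_DATA = [("The new phone looks great", {"entities": [(8, 13, "GADGET")]})]
for text, annotations in TRAIN_DATA:
    nlp.update([text], [annotations], sgd=optimizer, drop=0.2)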

Hope that helps!

dixitishan811 commented 4 years ago

Thanks for the reply! I will be waiting for the v3 update, and yes, it would be really helpful if documentation about the internal structure of the model is added in the future.

github-actions[bot] commented 2 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.