zhongkaifu / Seq2SeqSharp

Seq2SeqSharp is a fast, flexible, tensor-based deep neural network framework written in .NET (C#). Its features include automatic differentiation, multiple network types (Transformer, LSTM, BiLSTM and so on), multi-GPU support, cross-platform support (Windows, Linux, x86, x64, ARM), and multimodal models for text and images.

Example for Seq2Seq for sequence-classification #36

Closed ic202 closed 2 years ago

ic202 commented 2 years ago

I would love to use Seq2SeqSharp for multi-intent classification for a chatbot. Unfortunately, I do not understand the examples offered here, and I can't find the test file either. Is there a C# demo project somewhere that explains this problem?

I should mention that my only experience with AI is with Microsoft ML.NET.

Thank you in advance.

zhongkaifu commented 2 years ago

Hi @ic202, for a classification task you should use SeqClassificationConsole, which takes two input files for training. For your chatbot classification task, one file would contain one turn per line, and the other file would contain the corresponding intention of each turn. For example:

File: train.src.snt (one turn per line)
    How are you ?
    I 'm going to Seattle , WA .
    Do you like this movie ?
    ...

File: train.tgt.snt (the corresponding intention, one per line)
    Greeting
    Travel
    Entertainment
    ...

For testing, it uses just one file containing the turns.

The example above uses only a single feature group (the word tokens of the turn) in train.src.snt. If you have multiple feature groups, you can separate them with a [tab] character. For example: Do you like this movie ? \t [the context of previous turns]
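The file layout described above can be sketched in a few lines. This is only an illustration in Python, not part of Seq2SeqSharp: the example turns, contexts, and the helper names `build_lines` and `write_parallel_files` are hypothetical, and the only things taken from this thread are the file names, the one-turn-per-line rule, and the tab between feature groups.

```python
import os
import tempfile

# Hypothetical example data: (turn, context of previous turns, intention).
examples = [
    ("How are you ?", "Hello .", "Greeting"),
    ("I 'm going to Seattle , WA .", "How are you ?", "Travel"),
    ("Do you like this movie ?", "I 'm going to Seattle , WA .", "Entertainment"),
]


def build_lines(examples, with_context=False):
    """Return (source_lines, target_lines), one entry per turn."""
    src, tgt = [], []
    for turn, context, intent in examples:
        # Feature groups on the source side are separated by a tab character.
        src.append(f"{turn}\t{context}" if with_context else turn)
        tgt.append(intent)
    return src, tgt


def write_parallel_files(directory, src, tgt):
    """Write train.src.snt and train.tgt.snt with matching line counts."""
    if len(src) != len(tgt):
        raise ValueError("source and target must have the same number of lines")
    src_path = os.path.join(directory, "train.src.snt")
    tgt_path = os.path.join(directory, "train.tgt.snt")
    for path, lines in ((src_path, src), (tgt_path, tgt)):
        with open(path, "w", encoding="utf-8") as f:
            f.write("\n".join(lines) + "\n")
    return src_path, tgt_path


src_lines, tgt_lines = build_lines(examples, with_context=True)
out_dir = tempfile.mkdtemp()
src_path, tgt_path = write_parallel_files(out_dir, src_lines, tgt_lines)
```

Line N of the source file must describe the same turn as line N of the target file, which is why the sketch refuses to write files of different lengths.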

As for demos, SeqClassificationConsole currently isn't included in the release package, but I can look into creating a demo project for it. Do you have any demo data set that could be included in the release package?

ic202 commented 2 years ago

First of all, thanks for the quick reply. Of course I can send you test data. However, it may be better to first explain how my recognizer works now:

  1. I am currently using Microsoft ML.NET.
  2. I pass the sentence through 5 models and at the end I form one intent. Basically this works really well. I have about 12000 sentences and ML.NET easily recognizes over 1000 intents. The only problem is that I can only recognize one intent per sentence.

So if the user sends the message "I was at the Belvedere Hotel last year, I want to book again this year, can you send me an offer for this year", my system can only recognize one intent, say "i_was_at_the_hotel". In this case, of course, I need 3 intents: "i_was_at_the_hotel", "want_to_book", "send_offer".

So what interests me is whether this problem can be solved with Seq2SeqSharp. If so, I hope I don't need millions of test samples :)

As for the test data, I am happy to send it to you. I just have to supplement it with sentences that have 2 or more intents (as I said, the data currently has only one intent per sentence). Are 1000 sentences with one intent and 100 sentences with 2 intents enough? Also, let's not forget that my data is in German. Is that OK? :)

zhongkaifu commented 2 years ago

Based on your description, I think this is a sequence labeling task rather than a sequence classification task. The input and output for your example could then be:

input token \t output tag
I \t B
was \t M
at \t M
the \t M
Belvedere \t M
Hotel \t E
last \t O
year \t O
, \t O
I \t O
want \t B
to \t M
book \t E
again \t O
this \t O
year \t O
, \t O
can \t O
you \t O
send \t B
me \t M
an \t M
offer \t E
for \t O
this \t O
year \t O

B means the beginning of the tag, M means the middle of the tag, E means the end of the tag, and O means other. So you can see that the sequences "I was at the Belvedere Hotel", "want to book" and "send me an offer" are labeled by this model.
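To make the B/M/E/O scheme concrete, here is a minimal sketch in Python that converts token spans to tags and back. It is not Seq2SeqSharp code; the function names `encode_bmeo` and `decode_bmeo` are hypothetical, and because the scheme as described has no tag for a single-token span, the sketch simply rejects such spans.

```python
def encode_bmeo(num_tokens, spans):
    """Turn (start, end) token spans (end inclusive) into B/M/E/O tags."""
    tags = ["O"] * num_tokens
    for start, end in spans:
        if end <= start:
            # The scheme in this thread has no tag for one-token spans.
            raise ValueError("a span needs at least two tokens")
        tags[start] = "B"
        for i in range(start + 1, end):
            tags[i] = "M"
        tags[end] = "E"
    return tags


def decode_bmeo(tokens, tags):
    """Collect the token sequences that run from a B tag to the next E tag."""
    spans = []
    current = None
    for token, tag in zip(tokens, tags):
        if tag == "B":
            current = [token]
        elif tag == "M" and current is not None:
            current.append(token)
        elif tag == "E" and current is not None:
            current.append(token)
            spans.append(" ".join(current))
            current = None
        else:
            # "O" or a malformed sequence: drop any span that is still open.
            current = None
    return spans


sentence = ("I was at the Belvedere Hotel last year , I want to book again "
            "this year , can you send me an offer for this year")
tokens = sentence.split()
tags = encode_bmeo(len(tokens), [(0, 5), (10, 12), (19, 22)])
labeled = decode_bmeo(tokens, tags)
```

With the three spans above, `labeled` holds the same three sequences listed in the explanation: "I was at the Belvedere Hotel", "want to book" and "send me an offer".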

For an intention classification task, you should use general tags rather than specific ones. For example, "i_was_at_the_hotel" can become "LOCATION_HOTEL", and "he_was_at_the_hotel" should use the same tag.

For multi-intention classification, you should use softmax as your top layer, because it naturally supports multi-class classification.
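As a rough illustration of what a softmax top layer does for one token, the Python sketch below turns raw scores into a probability distribution over the tags and picks the most likely one. The scores are made up and the names `TAGS` and `predict_tag` are hypothetical; nothing here reflects how Seq2SeqSharp implements its layers.

```python
import math

TAGS = ["B", "M", "E", "O"]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_tag(scores):
    """Return the highest-probability tag and the full distribution."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return TAGS[best], probs


# Hypothetical scores for one token, in the order of TAGS.
tag, probs = predict_tag([2.0, 0.5, 0.1, 1.0])
```

Since softmax selects exactly one tag per token, several intents in one sentence show up as several labeled spans rather than as several labels on the same token.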

As for a model, the release package currently doesn't include a demo for the sequence labeling task, but I will add one later.

ic202 commented 2 years ago

Training data is not a problem to prepare in this format. We can set labels for the beginning and end of the sentence. However, I can't prepare the prediction data like this, because I don't know where the beginning and end of the middle sentence are. What format does the prediction data use?

The example I gave was written hastily and does not match our data. But I'm sure I use a lot more intents than I might need. That's why, at the moment, I have several models.

zhongkaifu commented 2 years ago

The test data should contain only the first column (the tokens), and the model will predict the tags.
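One way to get such a test file is to drop the tag column from data that is already labeled. The Python sketch below assumes the tab-separated, one-token-per-line layout used earlier in this thread; the released demo's actual file layout may differ, and the function name `tokens_only` is hypothetical.

```python
def tokens_only(labeled_lines):
    """Keep only the first (token) column of tab-separated labeled lines."""
    out = []
    for line in labeled_lines:
        line = line.rstrip("\n")
        if not line.strip():
            # Keep blank lines; they are assumed here to separate sentences.
            out.append("")
            continue
        out.append(line.split("\t")[0].strip())
    return out


labeled_lines = ["want\tB", "to\tM", "book\tE", "", "again\tO"]
test_lines = tokens_only(labeled_lines)
```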

ic202 commented 2 years ago

ok then i will wait for you to make a demo. Do you need demo data from me?

zhongkaifu commented 2 years ago

Thanks. No, I will use a public data set for the demo.

zhongkaifu commented 2 years ago

Added a demo for the sequence labeling task. You can take a look at https://github.com/zhongkaifu/Seq2SeqSharp/releases/tag/RELEASE_2_4_1 and try running test_ner_enu.bat.