zhongkaifu / Seq2SeqSharp

Seq2SeqSharp is a tensor-based, fast and flexible deep neural network framework written in .NET (C#). It has many highlighted features, such as automatic differentiation, different network types (Transformer, LSTM, BiLSTM and so on), multi-GPU support, cross-platform support (Windows, Linux, x86, x64, ARM), multimodal models for text and images, and more.

What are the next seq2seqSharp examples? #22

Closed GeorgeS2019 closed 1 year ago

GeorgeS2019 commented 3 years ago

Here are the categories of HuggingFace transformer examples

[Screenshot: HuggingFace transformer example categories]

It seems you have started to build more .NET seq2seqSharp transformer examples (e.g. Text Classification). Perhaps the HuggingFace example categories could guide which seq2seqSharp examples the .NET community needs next :-)

zhongkaifu commented 3 years ago

Hi @GeorgeS2019,

Thanks for your suggestion. I would rather let you, Seq2SeqSharp users, or anyone else interested vote on which example they would like to have next. :)

In addition, for some examples, such as summarization and Q&A, Seq2SeqSharp already supports them at the framework and code level; the only thing needed to build these examples is data.

Thanks Zhongkai Fu

GeorgeS2019 commented 3 years ago

@zhongkaifu .NET ML communities are converging, just like the path to .NET 6.

Through TorchSharp, a strategy has been developed to "convert" trained PyTorch networks into TorchSharp-friendly .NET model data.

Perhaps this could speed up having more Seq2SeqSharp transformer examples, with pretrained weights taken from PyTorch-based HuggingFace transformer models.

@zhongkaifu if you want to grant me a Christmas gift, please focus on a Seq2seqSharp text generation example, e.g. GPT-2 with weights coming from the PyTorch implementation of GPT-2.

We already have a .NET GPT-2 version with a C# BPE tokenizer based on PythonNET.
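For context on what such a BPE tokenizer does under the hood: byte-pair encoding repeatedly merges the most frequent adjacent token pair into a single token. A minimal, illustrative sketch of one merge step (in Python rather than C#, with hypothetical helper names; real tokenizers also track a learned merge table and byte-level fallbacks):

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Return the most frequent adjacent token pair in the sequence."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return max(pairs, key=pairs.get)

def merge_pair(tokens, pair):
    """Replace every occurrence of the adjacent pair with one merged token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

tokens = list("low lower lowest")
pair = most_frequent_pair(tokens)   # ('l', 'o') appears three times
tokens = merge_pair(tokens, pair)   # 'l' + 'o' fused into 'lo' everywhere
```

Training a full BPE vocabulary just repeats this merge step until the desired vocabulary size is reached, recording each chosen pair in order.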

==> In other words, @zhongkaifu, there is NO MORE EXCUSE of not having enough data to provide seq2seqSharp pretrained models :-).

zhongkaifu commented 3 years ago

Yeah~ In fact, Seq2SeqSharp is already able to train a GPT-2 model at the framework level (some code for random token sampling and masking still needs to be implemented, but it is pretty simple :) ). However, because pre-trained models don't work very well in my current work (very long sequences), I'm not currently focusing on this part. If anyone is interested in contributing it, that would be really appreciated. :)
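The two missing pieces mentioned here are indeed small. A conceptual sketch (in Python rather than C#, purely as an illustration of the ideas, not Seq2SeqSharp's actual API): a GPT-style causal mask lets position i attend only to positions <= i, and decoding samples the next token from a softmax over the logits.

```python
import math
import random

def causal_mask(n):
    """n x n attention mask: row i allows attending to columns j <= i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]

def sample_next_token(logits, temperature=1.0, rng=random):
    """Softmax over logits/temperature, then draw one token index at random."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                              # subtract max for stability
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r, acc = rng.random(), 0.0
    for i, p in enumerate(probs):                # inverse-CDF sampling
        acc += p
        if r < acc:
            return i
    return len(probs) - 1
```

Lower temperatures sharpen the distribution toward greedy decoding; higher ones flatten it toward uniform sampling.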

For Seq2SeqSharp, the next thing I may invest significant time in is supporting ONNX, which could break the barrier between Seq2SeqSharp and other frameworks.

zhongkaifu commented 1 year ago

A standalone GPT decoder and console tool is here: https://github.com/zhongkaifu/Seq2SeqSharp/commit/7fb859e418dbf8d7261660d144c0c81ddd9334ad