zhongkaifu / Seq2SeqSharp

Seq2SeqSharp is a fast, flexible, tensor-based deep neural network framework written in .NET (C#). Its features include automatic differentiation, several network types (Transformer, LSTM, BiLSTM and so on), multi-GPU support, cross-platform support (Windows, Linux, x86, x64, ARM), and multimodal models for text and images.

sentencepiece.dll problem in the API #57

Closed piedralaves closed 1 year ago

piedralaves commented 1 year ago

Dear Zhongkai,

We have a problem running the API. We are running a sequence to sequence topology.

When the `Startup` procedure runs and reaches the statement `var srcSentPiece = new SentencePiece( Configuration[ "Seq2Seq:SrcSentencePieceModelPath" ] );`, a library called sentencepiece is needed.

We put sentencepiece.dll in the bin folder, but this doesn't work.

In fact, our model was not trained with that tokenizer, so it is useless for us.

Two questions:

Why are "SrcSentencePieceModelPath" and "TgtSentencePieceModelPath" mandatory? How can we load sentencepiece.dll in your API project?

```csharp
public Startup( IConfiguration configuration )
{
    int maxTestSrcSentLength;
    int maxTestTgtSentLength;
    ProcessorTypeEnums processorType;
    string deviceIds;

    Configuration = configuration;

    if ( !Configuration[ "Seq2Seq:ModelFilePath" ].IsNullOrEmpty() )
    {
        Logger.WriteLine( $"Loading Seq2Seq model '{Configuration[ "Seq2Seq:ModelFilePath" ]}'" );

        var modelFilePath = Configuration[ "Seq2Seq:ModelFilePath" ];
        maxTestSrcSentLength = Configuration[ "Seq2Seq:MaxSrcTokenSize" ].ToInt();
        maxTestTgtSentLength = Configuration[ "Seq2Seq:MaxTgtTokenSize" ].ToInt();
        processorType = Configuration[ "Seq2Seq:ProcessorType" ].ToEnum< ProcessorTypeEnums >();
        deviceIds = Configuration[ "Seq2Seq:DeviceIds" ];

        // The two lines below are the ones that require sentencepiece.dll.
        var srcSentPiece = new SentencePiece( Configuration[ "Seq2Seq:SrcSentencePieceModelPath" ] );
        var tgtSentPiece = new SentencePiece( Configuration[ "Seq2Seq:TgtSentencePieceModelPath" ] );

        Seq2SeqInstance.Initialization( modelFilePath, maxTestSrcSentLength, maxTestTgtSentLength,
                                        processorType, deviceIds, (srcSentPiece, tgtSentPiece) );
    }
```

Many thanks

piedralaves commented 1 year ago

Can we bypass that dll? Is sentencepiece important? I mean, can we comment out the lines where that library is used and adjust the rest of the code so that the service responds without it?

We did that and the service seems to work properly. It returns the sequences as well.

Thanks

zhongkaifu commented 1 year ago

Hi @piedralaves

SentencePiece is optional. You can safely comment it out if you don't want to use it. I will update this in SeqWebAPIs.
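As a rough illustration of how the tokenizer could be made optional without commenting code out, the sketch below only constructs it when a model path is configured. The helper `LoadIfConfigured` and the stand-in class `FakeSentencePiece` are hypothetical names introduced here so the example compiles on its own; they are not part of Seq2SeqSharp.

```csharp
using System;

// Hypothetical helper (not part of Seq2SeqSharp): builds an object from a
// configured path, or returns null when no path is configured.
public static class OptionalLoader
{
    public static T LoadIfConfigured<T>( string path, Func<string, T> factory ) where T : class
    {
        // A missing or blank configuration value means "do not load anything".
        if ( string.IsNullOrWhiteSpace( path ) )
        {
            return null;
        }

        return factory( path );
    }
}

// Stand-in for the real SentencePiece wrapper so this sketch is self-contained.
public sealed class FakeSentencePiece
{
    public string ModelPath { get; }

    public FakeSentencePiece( string modelPath ) => ModelPath = modelPath;
}

public static class Demo
{
    public static void Main()
    {
        // Empty path: nothing is constructed, so no native library is needed.
        var src = OptionalLoader.LoadIfConfigured( "", p => new FakeSentencePiece( p ) );

        // Non-empty path: the factory runs and the object is created.
        var tgt = OptionalLoader.LoadIfConfigured( "tgt.model", p => new FakeSentencePiece( p ) );

        Console.WriteLine( src == null );   // prints "True"
        Console.WriteLine( tgt.ModelPath ); // prints "tgt.model"
    }
}
```

In `Startup`, the same pattern would wrap the two `new SentencePiece( ... )` calls, passing `Configuration[ "Seq2Seq:SrcSentencePieceModelPath" ]` and `Configuration[ "Seq2Seq:TgtSentencePieceModelPath" ]` as the paths. Whether `Seq2SeqInstance.Initialization` accepts null tokenizers is an assumption that this thread does not confirm, so the call site may need a separate branch.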

If you want to use SentencePiece, you can do one of the following:

1) Use the sentencepiece dll or lib file under C:\Works\Projects\Seq2SeqSharp\dll
2) Build sentencepiece using the code in Seq2SeqSharp\ExternalProjects\SentencePiece.zip
3) git clone https://github.com/zhongkaifu/sentencepiece , build it locally, and copy the dll or lib files to the same folder as SeqWebAPIs

The original sentencepiece has no interface that can be called from C#, so Seq2SeqSharp uses a modified version.

Let me know if you have any further questions.

Thanks,
Zhongkai Fu