dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License

Add GenAI packages #7169

Open LittleLittleCloud opened 5 months ago

LittleLittleCloud commented 5 months ago


Describe the solution you'd like The GenAI packages will provide TorchSharp implementations of a series of popular GenAI models. The goal is to be able to load the same weights as the corresponding Python models.

The following models will be added in the first wave

MEAI integration

Along with benchmarks


lostmsu commented 2 months ago

Can you guys publish a preview for Microsoft.ML.GenAI.LLaMA package?

LittleLittleCloud commented 2 months ago

@lostmsu You should be able to consume it from the daily build below.

Oh, I just noticed that the GenAI packages haven't had IsPackable set to true, so they aren't available in the daily build. I'll publish a PR to enable the package flag.
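For context, whether a project produces a NuGet package during the build is controlled by the IsPackable MSBuild property. An illustrative csproj fragment (the property name is standard MSBuild; the surrounding project structure here is just a sketch):

```xml
<PropertyGroup>
  <!-- When true, `dotnet pack` (and packing CI legs) emit a .nupkg for this project. -->
  <IsPackable>true</IsPackable>
</PropertyGroup>
```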

aforoughi1 commented 1 month ago

Can you please publish a preview for the Microsoft.ML.GenAI.Core package? It is not available on

https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-libraries/nuget/v3/index.json

The sample Microsoft.ML.GenAI.Samples/Llama/LLaMA3_1.cs is broken without it.

Furthermore, the sample has a hard-coded weight folder:

```csharp
var weightFolder = @"C:\Users\xiaoyuz\source\repos\Meta-Llama-3.1-8B-Instruct";
```

I have downloaded the model and config from the Meta site. Maybe a few comments would be helpful.

LittleLittleCloud commented 1 month ago

> Can you please publish a preview for the Microsoft.ML.GenAI.Core package? It is not available on
>
> https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-libraries/nuget/v3/index.json
>
> The sample Microsoft.ML.GenAI.Samples/Llama/LLaMA3_1.cs is broken without it.
>
> Furthermore, the sample has a hard-coded weight folder: `var weightFolder = @"C:\Users\xiaoyuz\source\repos\Meta-Llama-3.1-8B-Instruct";` I have downloaded the model and config from the Meta site. Maybe a few comments would be helpful.

Oh, sorry, I'll make the fix

aforoughi1 commented 1 month ago

I am getting a System.IO.FileNotFoundException (couldn't find model.safetensors.index.json) when calling Microsoft.ML.GenAI.LLaMA.LlamaForCausalLM.FromPretrained(String modelFolder, String configName, String checkPointName, ScalarType torchDtype, String device). I can't get the example working. Please explain where/what this file is?

LittleLittleCloud commented 1 month ago

@aforoughi1 Which Llama? I suppose you are running Llama 3.2 1B?

aforoughi1 commented 1 month ago

Llama3.1-8B

LittleLittleCloud commented 1 month ago

@aforoughi1

The error basically says it can't find {ModelFolder}/model.safetensors.index.json. Could you share the full code calling the model, the stack trace, and a screenshot of the Llama 3.1 8B model folder?

aforoughi1 commented 1 month ago

model folder

```csharp
// issue 7169
// Meta-Llama-3.1-8B-Instruct/original
string weightFolder = @"C:\Users\abbas.llama\checkpoints\Llama3.1-8B";
string configName = "params.json";
string modelFile = "tokenizer.model";

TiktokenTokenizer tokenizer = LlamaTokenizerHelper.FromPretrained(weightFolder, modelFile);
LlamaForCausalLM model = LlamaForCausalLM.FromPretrained(weightFolder, configName, layersOnTargetDevice: -1, targetDevice: "cpu");
Console.WriteLine("Loading Llama from model weight folder");

var pipeline = new CausalLMPipeline<TiktokenTokenizer, LlamaForCausalLM>(tokenizer, model, "cpu");
```

```
System.IO.FileNotFoundException
  HResult=0x80070002
  Message=Could not find file 'C:\Users\abbas.llama\checkpoints\Llama3.1-8B\model.safetensors.index.json'.
  Source=System.Private.CoreLib
  StackTrace:
   at Microsoft.Win32.SafeHandles.SafeFileHandle.CreateFile(String fullPath, FileMode mode, FileAccess access, FileShare share, FileOptions options)
   at Microsoft.Win32.SafeHandles.SafeFileHandle.Open(String fullPath, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize)
   at System.IO.Strategies.OSFileStreamStrategy..ctor(String path, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize)
   at System.IO.Strategies.FileStreamHelpers.ChooseStrategyCore(String path, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize)
   at System.IO.Strategies.FileStreamHelpers.ChooseStrategy(FileStream fileStream, String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options, Int64 preallocationSize)
   at System.IO.StreamReader.ValidateArgsAndOpenPath(String path, Encoding encoding, Int32 bufferSize)
   at System.IO.File.InternalReadAllText(String path, Encoding encoding)
   at System.IO.File.ReadAllText(String path)
   at TorchSharp.PyBridge.PyBridgeModuleExtensions.load_checkpoint(Module module, String path, String checkpointName, Boolean strict, IList`1 skip, Dictionary`2 loadedParameters, Boolean useTqdm)
   at Microsoft.ML.GenAI.LLaMA.LlamaForCausalLM.FromPretrained(String modelFolder, String configName, String checkPointName, ScalarType torchDtype, String device)
   at Test.GenAITest.LLaMATest1() in C:\Users\abbas\OneDrive\Documents\WorkingProgress\MLStcokMarketPrediction\Test\GenAITest.cs:line 35
   at Test.Program.GenAI() in C:\Users\abbas\OneDrive\Documents\WorkingProgress\MLStcokMarketPrediction\Test\Program.cs:line 425
   at Test.Program.Main(String[] args) in C:\Users\abbas\OneDrive\Documents\WorkingProgress\MLStcokMarketPrediction\Test\Program.cs:line 54
```

LittleLittleCloud commented 1 month ago

@aforoughi1 LlamaForCausalLM loads .safetensors model weights, while your code is targeting the original .pth model weight folder.

The .safetensors model weights should be located in Meta-Llama-3.1-8B-Instruct; maybe update the weight folder to that path when loading LlamaForCausalLM?

```csharp
LlamaForCausalLM model = LlamaForCausalLM.FromPretrained("Meta-Llama-3.1-8B-Instruct", configName, layersOnTargetDevice: -1, targetDevice: "cpu");
```

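For reference, model.safetensors.index.json is the shard index used by Hugging Face-style sharded checkpoints: it maps each weight tensor name to the .safetensors shard file that contains it. An illustrative sketch only (the tensor names and total_size here are placeholders, not the actual Llama 3.1 8B contents):

```json
{
  "metadata": { "total_size": 16000000000 },
  "weight_map": {
    "model.embed_tokens.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
    "lm_head.weight": "model-00004-of-00004.safetensors"
  }
}
```

The loader reads this file first, which is why a folder containing only the original .pth checkpoint fails with FileNotFoundException.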
aforoughi1 commented 1 month ago

I sorted out the following missing files and the directory structure:

- model.safetensors.index.json
- model-00001-of-00004.safetensors
- model-00002-of-00004.safetensors
- model-00003-of-00004.safetensors
- model-00004-of-00004.safetensors

The model loads successfully ONLY if I use the defaults: layersOnTargetDevice: -1, quantizeToInt8: false, quantizeToInt4: false.

Setting layersOnTargetDevice: 26, quantizeToInt8: true causes a memory corruption exception.

The example is also missing stopWatch.Stop();

I also don't see RegisterPrintMessage() print any messages to the console.

LittleLittleCloud commented 1 month ago

@aforoughi1 Are you using the nightly build, or trying the example from the main branch?

aforoughi1 commented 1 month ago

Nightly build.

LittleLittleCloud commented 1 month ago

@aforoughi1 And your GPU device/platform?

aforoughi1 commented 1 month ago

The device is set:

```csharp
torch.InitializeDeviceType(DeviceType.CPU);
```

Packages:

- microsoft.ml.genai.llama\0.22.0-preview.24477.3\
- microsoft.ml.torchsharp\0.21.1\
- torchsharp-cpu\0.103.0\

Processor 12th Gen Intel(R) Core(TM) i5-1235U 2.50 GHz

Installed RAM 16.0 GB (15.8 GB usable)

System type 64-bit operating system, x64-based processor

Edition Windows 11 Home

Version 23H2

OS build 22631.4249

Experience Windows Feature Experience Pack 1000.22700.1041.0


LittleLittleCloud commented 1 month ago

The layersOnTargetDevice is for GPU, so I haven't test values other than -1 in CPU scenario. For the quantizeToInt8 and quantizeToInt4, you probably also won't gain benefits on CPU scenarios. So maybe just keep it as false.
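Pulling the thread together, a minimal CPU-only loading sketch. This is not an authoritative sample: the weight folder path is a placeholder, the calls mirror the ones shown earlier in this thread, and exact FromPretrained parameters and defaults may differ between preview package versions.

```csharp
using Microsoft.ML.GenAI.Core;
using Microsoft.ML.GenAI.LLaMA;
using Microsoft.ML.Tokenizers;
using TorchSharp;

// Force CPU, as done earlier in the thread.
torch.InitializeDeviceType(DeviceType.CPU);

// Placeholder path: must be the folder containing the *.safetensors shards
// and model.safetensors.index.json, NOT the original/ folder with .pth weights.
var weightFolder = @"C:\path\to\Meta-Llama-3.1-8B-Instruct";

var tokenizer = LlamaTokenizerHelper.FromPretrained(weightFolder, "tokenizer.model");

// Defaults that loaded successfully on CPU in this thread:
// layersOnTargetDevice: -1 (all layers on the target device),
// quantizeToInt8 / quantizeToInt4 left false (no benefit on CPU).
var model = LlamaForCausalLM.FromPretrained(weightFolder, layersOnTargetDevice: -1, targetDevice: "cpu");

var pipeline = new CausalLMPipeline<TiktokenTokenizer, LlamaForCausalLM>(tokenizer, model, "cpu");
```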