LittleLittleCloud opened this issue 5 months ago
Can you guys publish a preview for Microsoft.ML.GenAI.LLaMA package?
@lostmsu You should be able to consume it from the daily build below
Oh, just noticed that the GenAI package hasn't had IsPackable set
to true, so it's not available on the daily build. Will publish a PR to enable the package flag.
Can you please publish a preview for the Microsoft.ML.GenAI.Core package? It is not available at
https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-libraries/nuget/v3/index.json
The sample Microsoft.ML.GenAI.Samples/Llama/LLaMA3_1.cs is broken without it.
Furthermore, the sample has a hardcoded weight folder:
var weightFolder = @"C:\Users\xiaoyuz\source\repos\Meta-Llama-3.1-8B-Instruct";
I have downloaded the model and config from the Meta site. Maybe a few comments would be helpful.
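Until the sample is fixed, a minimal workaround sketch for the hardcoded path (the LLAMA_WEIGHT_FOLDER variable name is my own assumption, not part of the sample):

```csharp
// Sketch: read the weight folder from an environment variable instead of a
// hardcoded user-specific path; fall back to a local default if unset.
var weightFolder = Environment.GetEnvironmentVariable("LLAMA_WEIGHT_FOLDER")
    ?? @"C:\models\Meta-Llama-3.1-8B-Instruct";
```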
Oh, sorry, I'll make the fix
I am getting a System.IO.FileNotFoundException (couldn't find model.safetensors.index.json) when calling Microsoft.ML.GenAI.LLaMA.LlamaForCausalLM.FromPretrained(String modelFolder, String configName, String checkPointName, ScalarType torchDtype, String device). I can't get the example working; please explain where this file comes from and what it is.
@aforoughi1 Which Llama? I suppose you are running Llama 3.2 1B?
Llama3.1-8B
@aforoughi1
The error basically says it can't find {ModelFolder}/model.safetensors.index.json. Could you share the full code you use to call the model, the stack trace, and a screenshot of the Llama 3.1 8B model folder?
// issue 7169
// Meta-Llama-3.1-8B-Instruct/original
string weightFolder = @"C:\Users\abbas.llama\checkpoints\Llama3.1-8B";
string configName = "params.json";
string modelFile = "tokenizer.model";

TiktokenTokenizer tokenizer = LlamaTokenizerHelper.FromPretrained(weightFolder, modelFile);
LlamaForCausalLM model = LlamaForCausalLM.FromPretrained(weightFolder, configName, layersOnTargetDevice: -1, targetDevice: "cpu");
Console.WriteLine("Loading Llama from model weight folder");
var pipeline = new CausalLMPipeline<TiktokenTokenizer, LlamaForCausalLM>(tokenizer, model, "cpu");
System.IO.FileNotFoundException
HResult=0x80070002
Message=Could not find file 'C:\Users\abbas.llama\checkpoints\Llama3.1-8B\model.safetensors.index.json'.
Source=System.Private.CoreLib
StackTrace:
at Microsoft.Win32.SafeHandles.SafeFileHandle.CreateFile(String fullPath, FileMode mode, FileAccess access, FileShare share, FileOptions options)
at Microsoft.Win32.SafeHandles.SafeFileHandle.Open(String fullPath, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize)
at System.IO.Strategies.OSFileStreamStrategy..ctor(String path, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize)
at System.IO.Strategies.FileStreamHelpers.ChooseStrategyCore(String path, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize)
at System.IO.Strategies.FileStreamHelpers.ChooseStrategy(FileStream fileStream, String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options, Int64 preallocationSize)
at System.IO.StreamReader.ValidateArgsAndOpenPath(String path, Encoding encoding, Int32 bufferSize)
at System.IO.File.InternalReadAllText(String path, Encoding encoding)
at System.IO.File.ReadAllText(String path)
at TorchSharp.PyBridge.PyBridgeModuleExtensions.load_checkpoint(Module module, String path, String checkpointName, Boolean strict, IList`1 skip, Dictionary`2 loadedParameters, Boolean useTqdm)
at Microsoft.ML.GenAI.LLaMA.LlamaForCausalLM.FromPretrained(String modelFolder, String configName, String checkPointName, ScalarType torchDtype, String device)
at Test.GenAITest.LLaMATest1() in C:\Users\abbas\OneDrive\Documents\WorkingProgress\MLStcokMarketPrediction\Test\GenAITest.cs:line 35
at Test.Program.GenAI() in C:\Users\abbas\OneDrive\Documents\WorkingProgress\MLStcokMarketPrediction\Test\Program.cs:line 425
at Test.Program.Main(String[] args) in C:\Users\abbas\OneDrive\Documents\WorkingProgress\MLStcokMarketPrediction\Test\Program.cs:line 54
@aforoughi1 LlamaForCausalLM loads the .safetensors model weights, while in your code you are targeting the original .pth model weight folder. The .safetensors weights should be located in Meta-Llama-3.1-8B-Instruct; maybe update the weight folder to that path when loading LlamaForCausalLM?

LlamaForCausalLM model = LlamaForCausalLM.FromPretrained("Meta-Llama-3.1-8B-Instruct", configName, layersOnTargetDevice: -1, targetDevice: "cpu");
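For anyone else hitting this, a quick sanity check before loading makes the failure mode clearer (a sketch; the index file name comes from the exception above):

```csharp
// Sketch: fail fast with a descriptive message if the sharded .safetensors
// weights are missing from the folder being loaded.
var indexPath = Path.Combine(weightFolder, "model.safetensors.index.json");
if (!File.Exists(indexPath))
{
    throw new FileNotFoundException(
        $"No safetensors index found at '{indexPath}'. Point weightFolder at the " +
        "Meta-Llama-3.1-8B-Instruct (.safetensors) folder, not the original .pth checkpoint folder.");
}
```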
I sorted out the following missing files and the directory structure: model.safetensors.index.json, model-00001-of-00004.safetensors, model-00002-of-00004.safetensors, model-00003-of-00004.safetensors, model-00004-of-00004.safetensors.
The model loads successfully ONLY if I use the defaults: layersOnTargetDevice: -1, quantizeToInt8: false, quantizeToInt4: false.
Setting layersOnTargetDevice: 26, quantizeToInt8: true causes a memory corruption exception.
The example is also missing stopWatch.Stop();
I also don't see RegisterPrintMessage() print any messages to the console.
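For reference, the only call that loads successfully on this machine (a sketch built from the parameters named above; the exact parameter order is an assumption):

```csharp
// Sketch: defaults that load successfully on this CPU-only machine.
LlamaForCausalLM model = LlamaForCausalLM.FromPretrained(
    weightFolder,
    configName,
    layersOnTargetDevice: -1,   // -1 = all layers on the target device
    quantizeToInt8: false,      // layersOnTargetDevice: 26 + quantizeToInt8: true crashed
    quantizeToInt4: false,
    targetDevice: "cpu");
```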
@aforoughi1 Are you using the nightly build or trying the example from the main branch?
Nightly build.
@aforoughi1 And your GPU device/platform?
The device is set to CPU:
torch.InitializeDeviceType(DeviceType.CPU);
microsoft.ml.genai.llama\0.22.0-preview.24477.3\
microsoft.ml.torchsharp\0.21.1\
torchsharp-cpu\0.103.0\
Processor 12th Gen Intel(R) Core(TM) i5-1235U 2.50 GHz
Installed RAM 16.0 GB (15.8 GB usable)
System type 64-bit operating system, x64-based processor
Edition Windows 11 Home
Version 23H2
OS build 22631.4249
Experience Windows Feature Experience Pack 1000.22700.1041.0
The layersOnTargetDevice option is for GPU, so I haven't tested values other than -1 in the CPU scenario. For quantizeToInt8 and quantizeToInt4, you probably also won't gain any benefit in CPU scenarios, so maybe just keep them as false.
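In other words, a split like the following would only make sense on a GPU box (a sketch assuming a CUDA-enabled TorchSharp backend; untested per the comment above):

```csharp
// Sketch: GPU scenario - keep 26 layers on the GPU and quantize weights to
// reduce memory. On CPU-only setups keep the defaults (-1, false, false).
var model = LlamaForCausalLM.FromPretrained(
    weightFolder,
    configName,
    layersOnTargetDevice: 26,
    quantizeToInt8: true,
    targetDevice: "cuda");
```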
Describe the solution you'd like
The GenAI packages will provide a TorchSharp implementation for a series of popular GenAI models. The goal is to load the same weights as the corresponding Python model.

- Microsoft.ML.GenAI.Core (#7177)

The following models will be added in the first wave:

- Phi (Microsoft.ML.GenAI.Phi) #7184
- Microsoft.ML.GenAI.Phi project #7206
- Llama (Microsoft.ML.GenAI.LLaMA) #7220
- Mistral (Microsoft.ML.GenAI.Mistral)
- Stable Diffusion (Microsoft.ML.GenAI.StableDiffusion)
- MEAI integration

Along with the benchmarks:

- [ ] Benchmark for Phi-3
- [ ] Flash Attention support #7238
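For context, the intended end-to-end usage pieced together from the samples discussed in this thread (a sketch; the config name and the Generate call with its parameters are assumptions, not a confirmed API):

```csharp
// Sketch: load tokenizer + model, wrap them in a pipeline, and generate text.
var weightFolder = @"C:\models\Meta-Llama-3.1-8B-Instruct"; // hypothetical local path
var tokenizer = LlamaTokenizerHelper.FromPretrained(weightFolder, "tokenizer.model");
var model = LlamaForCausalLM.FromPretrained(
    weightFolder, "config.json",            // config name is an assumption
    layersOnTargetDevice: -1, targetDevice: "cpu");
var pipeline = new CausalLMPipeline<TiktokenTokenizer, LlamaForCausalLM>(tokenizer, model, "cpu");
var reply = pipeline.Generate("What is ML.NET?", maxLen: 256); // Generate signature is an assumption
Console.WriteLine(reply);
```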