Open Serg2DFX opened 5 months ago
@Serg2DFX thanks for reporting this bug!
For us to repro the issue and test the fix, could you provide more information on the repro steps?
To reproduce, set the localization to ru-RU, fr-FR, or any other language that uses a comma as the decimal separator.
Sample repro steps:
Screenshot from the wizard: the UI shows the values with a dot as the decimal separator, while the logs show them with a comma:
```
Debug: generate-project[0]
03:04:44.23 0 ExecuteAsync Started
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<conda_env_name> jsonTokenValue:phi-2-env
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<prompt_template> jsonTokenValue:### Text: {}\n### The tone is:\n
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<data_configs_data_files> jsonTokenValue:dataset/dataset-classification.json
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<data_configs_split> jsonTokenValue:train
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<dataset_type> jsonTokenValue:corpus
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<text_cols> jsonTokenValue:[
"phrase",
"tone"
]
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<text_template> jsonTokenValue:### Text: {phrase}\n### The tone is:\n{tone}
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<line_by_line> jsonTokenValue:join
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<source_max_len> jsonTokenValue:1024
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<pad_to_max_len> jsonTokenValue:false
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<compute_dtype> jsonTokenValue:bfloat16
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<quant_type> jsonTokenValue:nf4
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<double_quant> jsonTokenValue:true
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<lora_r> jsonTokenValue:64
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<lora_alpha> jsonTokenValue:64
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<lora_dropout> jsonTokenValue:0,1
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<eval_dataset_size> jsonTokenValue:0,3
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<training_args_seed> jsonTokenValue:0
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<training_args_data_seed> jsonTokenValue:42
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<per_device_train_batch_size> jsonTokenValue:1
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<per_device_eval_batch_size> jsonTokenValue:1
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<gradient_accumulation_steps> jsonTokenValue:4
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<gradient_checkpointing> jsonTokenValue:false
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<learning_rate> jsonTokenValue:0,0001
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<num_train_epochs> jsonTokenValue:3
Trace: generate-project[0]
03:04:44.24 0 ReplaceToken:<max_steps> jsonTokenValue:1200
Trace: generate-project[0]
```
Note the comma-formatted values above (`lora_dropout:0,1`, `eval_dataset_size:0,3`, `learning_rate:0,0001`), which are not valid JSON numbers.
You can check the locale-dependent number formatting by analogy with: https://www.w3schools.com/jsref/tryit.asp?filename=tryjsref_tolocalestring_num_all
I created a project with microsoft/phi-2 and the generated content (the JSON files) is not valid; see the screenshot: ![image](https://github.com/microsoft/windows-ai-studio/assets/5922647/552e47cc-2576-4331-9de4-277c7e84f575)
Please use CultureInfo.InvariantCulture in the codegen.
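A minimal sketch of the suspected cause, not the actual windows-ai-studio code: if a token value is stringified with a locale-sensitive API before being spliced into the JSON template, comma-decimal locales such as ru-RU produce invalid JSON. The helper names below are hypothetical; the .NET equivalent of the fix is `value.ToString(CultureInfo.InvariantCulture)`.

```javascript
// Locale-sensitive conversion (the suspected bug): under ru-RU the
// decimal separator becomes a comma, which is not a valid JSON number.
function formatLocaleSensitive(value, locale) {
  return value.toLocaleString(locale); // 0.1 -> "0,1" under ru-RU
}

// Locale-independent conversion (the fix): Number.prototype.toString
// always uses a dot, regardless of the process locale.
function formatInvariant(value) {
  return String(value); // 0.1 -> "0.1" in every locale
}

console.log(formatLocaleSensitive(0.1, 'ru-RU')); // "0,1" -> breaks the generated JSON
console.log(formatInvariant(0.1));                // "0.1" -> valid JSON
console.log(JSON.stringify({ lora_dropout: 0.1 })); // serializers also always emit a dot
```

Serializing the whole config with `JSON.stringify` (or a culture-invariant `ToString` in .NET) instead of formatting numbers into a text template would avoid the problem entirely.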