Closed mwangistan closed 1 year ago
Hi @mwangistan,
Thank you for your feedback. The CK2 (CM) concept is to provide wrappers (CM scripts) for the main DevOps/MLOps tasks and let the community gradually add portability across all SW/HW stacks. Some of the CM scripts support all OSes (Linux, MacOS, Windows) and some do not.
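The wrapper idea above can be sketched as a single entry point that dispatches to OS-specific implementations. This is only a hypothetical illustration of the concept, not the actual CM API; the function and task names are made up:

```python
import platform

def run_task(task_name: str) -> str:
    # Hypothetical sketch of the CM wrapper idea: one portable entry
    # point that selects an OS-specific implementation at run time.
    handlers = {
        "Linux": lambda: f"{task_name}: using the Linux (bash) implementation",
        "Darwin": lambda: f"{task_name}: using the macOS implementation",
        "Windows": lambda: f"{task_name}: using the Windows (batch) implementation",
    }
    system = platform.system()
    handler = handlers.get(system)
    if handler is None:
        # Mirrors the situation in this issue: a script that has not
        # yet been ported to the current OS.
        raise NotImplementedError(f"{task_name} is not yet ported to {system}")
    return handler()
```

In CM itself, community contributors fill in the missing per-OS branches over time, which is why some scripts work everywhere and others raise a "not supported on this OS" error.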
We have added and tested a CM script for MLPerf inference on Linux and MacOS, but we have not fully ported it to Windows because there were no requests. Can you please say why you would like to run MLPerf inference on Windows, and whether you would be interested in helping the community by providing the missing Windows support?
Please feel free to join our Discord server to discuss that: https://discord.gg/JjWNWXKxwT .
Thank you and have a good week, Grigori
Hi @gfursin, I appreciate the feedback. I was looking into comparing inference speeds across Linux, Mac and Windows. Are there any scripts that currently work on Windows that I can start with to get a feel for it?
I would be happy to help the community. Thanks
Hi @mwangistan - I would like to check these scripts with you on my Windows machine.
Would you be interested in setting up a conf-call?
By the way, I created a Collective Knowledge challenge to run MLPerf inference v3.1 on Windows: https://access.cknowledge.org/playground/?action=challenges&name=53e56d714c7649c7
Great, thanks. I've sent you a message on discord to schedule the call.
Hi Stanley, it was nice talking to you this week. As we discussed, please share a README about how you ran MLPerf inference on Windows without CM/CK first, and later we can work together to add support for CM/CK. Note that we are resuming our regular weekly conf-calls starting this Thursday: https://github.com/mlcommons/ck/blob/master/docs/taskforce.md#weekly-conf-calls . Feel free to join and share your feedback. Thanks!
Hi @gfursin. Thanks for the chat. I've created a fork with some changes I had made to the MLPerf source repo. You can find a README here: https://github.com/mwangistan/inference/blob/master/vision/classification_and_detection/GettingStarted.md . I'll join the call this week to share my feedback. Thanks
Great! Thank you very much @mwangistan . Let's check it during the conf-call tomorrow!
Hi @mwangistan . Due to the overlap with the MLCommons community meeting, we decided to cancel our weekly conf-call tomorrow and will have another one next week ... Sorry about that and looking forward to talking to you soon!
Hi @gfursin sounds good. Thanks
Hi @mwangistan,
I've managed to run MLPerf on Windows following these docs (https://github.com/mwangistan/inference/blob/master/vision/classification_and_detection/GettingStarted.md and https://github.com/mwangistan/inference/blob/master/vision/classification_and_detection/README.md#usage) and reusing RetinaNet ONNX model from the CM cache:
D:\grigori\inference\vision\classification_and_detection>python python/main.py --profile retinanet-onnxruntime --scenario Offline --model D:\CM\repos\local\cache\b5df9a3024564ba1\resnext50_32x4d_fpn.onnx --dataset-path D:\grigori\inference\vision\classification_and_detection\downloaded_dataset --accuracy
INFO:main:Namespace(dataset='openimages-800-retinanet-onnx', dataset_path='D:\\grigori\\inference\\vision\\classification_and_detection\\downloaded_dataset', dataset_list=None, data_format=None, profile='retinanet-onnxruntime', scenario='Offline', max_batchsize=1, model='D:\\Work1\\CM\\repos\\local\\cache\\b5df9a3024564ba1\\resnext50_32x4d_fpn.onnx', output='output', inputs=['images'], outputs=['boxes', 'labels', 'scores'], backend='onnxruntime', model_name='retinanet', threads=8, qps=None, cache=0, cache_dir=None, preprocessed_dir=None, use_preprocessed_dataset=False, accuracy=True, find_peak_performance=False, debug=False, mlperf_conf='../../mlperf.conf', user_conf='user.conf', audit_conf='audit.config', time=None, count=None, performance_sample_count=None, max_latency=None, samples_per_query=8)
INFO:coco:loaded 100 images, cache=0, already_preprocessed=False, took=0.0sec
C:\!Progs\Python39\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py:54: UserWarning: Specified provider 'CUDAExecutionProvider' is not in available provider names.Available providers: 'CPUExecutionProvider'
  warnings.warn(
INFO:main:starting TestScenario.Offline
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
Loading and preparing results...
Converting ndarray to lists...
(12228, 7)
0/12228
DONE (t=0.07s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=1.50s).
Accumulating evaluation results...
DONE (t=0.63s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.439
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.610
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.452
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.030
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.240
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.496
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.455
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.597
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.618
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.152
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.410
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.666
TestScenario.Offline qps=0.96, mean=69.5604, time=104.445, acc=44.194%, mAP=43.860%, queries=100, tiles=50.0:75.1461,80.0:101.4416,90.0:107.0209,95.0:108.2511,99.0:113.3382,99.9:114.7423
It should then be possible to support MLPerf for Windows in our CM workflows! I will try to update the CM workflows soon and will get back in touch!
Hi @gfursin that's great to hear. Looking forward to it. Thanks
Hi again @mwangistan .
I've added Windows support for the scripts required for MLPerf-RetinaNet. I also added a CM test script to run it on Windows: https://github.com/mlcommons/ck/tree/master/cm-mlops/script/test-mlperf-inference-retinanet-win .
You can try it as follows:
python3 -m pip install cmind
cm pull repo mlcommons@ck
cm run script "test mlperf-inference-win retinanet windows"
I didn't add full Windows support to our universal CM-MLPerf inference script since it would require many more updates and we don't have time to do that right now. But if it's of interest, you can use this script as an example and maybe update the other scripts ... We can discuss it during one of our conf-calls. Thank you very much for your feedback and have a good weekend!
Hi @gfursin thanks for adding the support. I'm getting an error when running
cm run script "test mlperf-inference-win retinanet windows"
Error: Running python.exe setup.py develop
bindings/python_api.cc(27): fatal error C1083: Cannot open include file: 'pybind11/functional.h': No such file or directory
CM error: Portable CM script failed (name = get-mlperf-inference-loadgen, return code = 1)
Hi @mwangistan. Just a note: did you clean the cache and update the mlcommons@ck repo? Can you please try again:
python -m pip uninstall mlperf_loadgen
cm pull repo mlcommons@ck
cm rm cache -f
cm run script "test mlperf-inference-win retinanet windows"
Otherwise, I think CM picks up an older loadgen version without the patch ...
I just checked the above steps and they worked on my Windows 10 machine with Python 3.9.6 and Visual Studio Community Edition 2022.
However, if it still doesn't work, I will check it again ... In that case, can you please send the whole log from the start? Thanks!
@gfursin It works now. I hadn't cleared the cache. Thanks
Cool! I am closing this ticket. Thank you for your feedback @mwangistan !
Trying to run
cm run script --tags=app,vision,language,mlcommons,mlperf,inference,generic --json=true
on Windows 11. I'm getting an error that the CM script isn't supported on Windows yet. Not sure if I'm missing anything. Thanks