adrianboguszewski commented 1 year ago

The models listed below are for the GSoC 2023 prerequisite task only.

We provide several potential candidates. Please select only one that hasn't already been selected (look at the checkboxes and comments below). When you decide, assign a model to yourself by adding a comment with the model name. We will then tick it to mark it as reserved.

If you struggle, you can reassign yourself to another model that is not yet taken. However, this can be done only once.

When you create a PR, please follow the self-checklist below:

each function is described by docstrings and type hints
notebook contains explicit descriptions and explanatory diagrams
the notebook doesn't use any data (image, video, etc.) that is not CC4.0 licensed
there is a README.md file in consistent style (look at other notebooks)
the notebook is added to the main README
there are no grammar, punctuation or typo issues (use any free tool for that e.g. Grammarly)
there are no committed files besides notebook and readme (please use images or videos from data dir)
your PR doesn't change any other notebooks
all CI checks passed
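
As an illustration of the first checklist item, here is a minimal sketch of a helper with a docstring and type hints. The function `scale_box` is hypothetical and is not taken from any existing notebook; it only shows the expected documentation style.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def scale_box(box: Box, ratio: float) -> Box:
    """
    Scale a bounding box by a constant ratio.

    :param box: Box coordinates as (x_min, y_min, x_max, y_max).
    :param ratio: Factor applied to every coordinate.
    :return: The scaled box in the same (x_min, y_min, x_max, y_max) order.
    """
    x_min, y_min, x_max, y_max = box
    return (x_min * ratio, y_min * ratio, x_max * ratio, y_max * ratio)
```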

Object detection:

[x] Yolov6 - https://github.com/meituan/YOLOv6 (@ahmd-nish)
[x] DAMO YOLO - https://github.com/tinyvision/DAMO-YOLO (@Muskan33)
[x] YoloX - https://github.com/Megvii-BaseDetection/YOLOX (@sawradip)
[x] RTMDet - https://github.com/open-mmlab/mmyolo/tree/main/configs/rtmdet (@AnuragMaiti)
[x] EfficientDet - https://github.com/google/automl/tree/master/efficientdet (@ashish-2005)
[x] CenterNet - https://github.com/xingyizhou/CenterNet/ (@rajuptvs)
[x] SSD MobileNet V2 http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_320x320_coco17_tpu-8.tar.gz https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md (@AlexFierro9)
[x] FasterRCNN Inception ResNet v2 http://download.tensorflow.org/models/object_detection/tf2/20200711/faster_rcnn_inception_resnet_v2_640x640_coco17_tpu-8.tar.gz https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md (@Paulooh007)
[x] YOLOS https://huggingface.co/hustvl/yolos-tiny (@SandeepaDevin)
[x] DETR https://huggingface.co/facebook/detr-resnet-50 (@Tatwansh)
[x] YoloR https://github.com/WongKinYiu/yolor (@18yz153)
[x] YoloF https://github.com/megvii-model/YOLOF (@thegeek13242)
[x] NanoDet https://github.com/RangiLyu/nanodet (@sahilpmehra)
[x] UltraFace https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB (@JacketChenlll)
[x] YoloV7 Face https://github.com/derronqi/yolov7-face (@lucifertrj)
[x] yolov5-blazeface https://github.com/openvinotoolkit/openvino_notebooks/blob/main/notebooks/205-vision-background-removal (@AnuragTimilsina)
[x] RetinaFace https://github.com/biubug6/Pytorch_Retinaface (@VaillaRohit)

Rotated object detection:

[ ] Rotated FCOS - https://github.com/open-mmlab/mmrotate/blob/main/configs/rotated_fcos/README.md
[x] ReDet - https://github.com/open-mmlab/mmrotate/blob/main/configs/redet/README.md (@nischay7)
[ ] Roi_trans - https://github.com/open-mmlab/mmrotate/blob/main/configs/roi_trans/README.md

Semantic Segmentation:

[x] SegFormer - https://huggingface.co/nvidia/segformer-b0-finetuned-ade-512-512 (@Kasliwal17)
[x] ClipSeg - https://huggingface.co/CIDAS/clipseg-rd64-refined (@RishithaR-388)
[x] SETR - https://github.com/fudan-zvg/SETR (@AniketARS)
[x] BeIT - https://huggingface.co/microsoft/beit-base-finetuned-ade-640-640 (@hadyy17)
[x] Segmenter - https://github.com/open-mmlab/mmsegmentation/tree/master/configs/segmenter (@blaz-r)
[x] DeepLab V3 - https://github.com/tensorflow/models/tree/master/research/deeplab (@chaitravi-ce)
[x] FaceParsing & MakeUp - https://github.com/zllrunning/face-parsing.PyTorch https://github.com/zllrunning/face-makeup.PyTorch (@Lj1ang)
[x] ESPNet https://github.com/sacmehta/ESPNet (@Nouran-Muhammad)
[ ] YoloP https://github.com/hustvl/YOLOP

Instance Segmentation:

[x] YOLACT https://github.com/dbolya/yolact.git (@Abdullah-Elkasaby)
[x] Mask RCNN Inception ResNet V2 http://download.tensorflow.org/models/object_detection/tf2/20200711/mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8.tar.gz https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md (@mr-rajashekhar)

Action/Gesture recognition:

[x] TSM - https://github.com/mit-han-lab/temporal-shift-module (@ntombi)
[x] Timesformer - https://huggingface.co/facebook/timesformer-base-finetuned-k400 (@BrennoMello)
[x] SlowFast - https://github.com/open-mmlab/mmaction2/blob/master/configs/recognition/slowfast/README.md (@rajatkrishna)
[x] YOWOv2 - https://github.com/yjh0410/YOWOv2 (@Matrixmang0)
[x] movinet - https://github.com/tensorflow/models/tree/master/official/projects/movinet (@sharvesh642)
[ ] xclip https://huggingface.co/microsoft/xclip-base-patch32

Background matting:

[ ] ModNet - https://github.com/ZHKKKe/MODNet
[x] Robust Video Background matting - https://github.com/PeterL1n/RobustVideoMatting (@wulongjian)
[ ] MGMatting https://github.com/yucornetto/MGMatting
[ ] PortraitNet https://github.com/dong-x16/PortraitNet

Old Photos Restoration/Image colorization/Image denoising/super resolution:

[x] Bringing Old Photos Back to Life - https://github.com/microsoft/Bringing-Old-Photos-Back-to-Life (@Om-Doiphode)
[x] DeOldify - https://github.com/jantic/DeOldify (@Dhruvanshu-Joshi)
[x] Coltran - https://github.com/google-research/google-research/tree/master/coltran (@weronikazak)
[x] Colorizer https://github.com/richzhang/colorization (@pyther-hub)
[x] SwinIR - https://github.com/JingyunLiang/SwinIR (@Z-Fran)
[x] style-swapping https://github.com/irasin/Pytorch_Style_Swap (@m-gopichand)
[x] Real-ESRGAN (for real images) - https://github.com/xinntao/Real-ESRGAN/blob/master/docs/model_zoo.md (@aadhamm)
[ ] Real-ESRGAN (for animation video) - https://github.com/xinntao/Real-ESRGAN/blob/master/docs/model_zoo.md
[ ] RCAN https://github.com/yulunzhang/RCAN
[ ] Super-SloMo https://github.com/rmalav15/Super-SloMo
[x] Photo2Cartoon https://github.com/minivision-ai/photo2cartoon (@sususama)

Depth estimation:

[ ] lite-mono https://github.com/noahzn/lite-monoc
[x] MiDaS 3.1 https://github.com/isl-org/MiDaS (@nsk126)
[x] Vi-Depth https://github.com/isl-org/VI-Depth (@pronoym99)

Text classification:

[x] Roberta - https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment (@ABHIJATSARARI)
[x] XLM-Roberta - https://huggingface.co/papluca/xlm-roberta-base-language-detection (@hazrulakmal)
[x] DepRoBerta - https://huggingface.co/rafalposwiata/deproberta-large-depression (@SpyzzVVarun)
[x] CodeBerta - https://huggingface.co/huggingface/CodeBERTa-language-id (@zilto)
[x] Albert V2 - https://huggingface.co/textattack/albert-base-v2-MRPC (@dwipddalal)
[x] DistilRoberta - https://huggingface.co/j-hartmann/emotion-english-distilroberta-base (@MR-ENVYR)
[x] FinBERT https://huggingface.co/yiyanghkust/finbert-tone (@shrey-2803)
[x] Deberta https://huggingface.co/microsoft/deberta-base-mnli (@mhy-666)

Token classification:

[x] Part of speech tagging - https://huggingface.co/flair/pos-english (@harish2773)
[x] Punctuation restoring (bert-restore-punctuation) - https://huggingface.co/felflare/bert-restore-punctuation (@seanjyu)
[x] Punctuation restoring (punctuate-all) - https://huggingface.co/kredor/punctuate-all (@theNobody-12)
[x] Typo detection - https://huggingface.co/m3hrdadfi/typo-detector-distilbert-en (@Ravindu987)
[x] Named entity recognition - https://huggingface.co/elastic/distilbert-base-cased-finetuned-conll03-english (@Aditya-vardhan13)

Text generation:

[x] BioGPT https://huggingface.co/microsoft/biogpt (@sidyakinian)
[x] gpt-neo https://huggingface.co/EleutherAI/gpt-neo-125M (@Warlord-K)
[x] OPT https://huggingface.co/facebook/opt-350m (@zhumakhan)

Text Summarization:

[x] DistilBART https://huggingface.co/sshleifer/distilbart-cnn-12-6 (@samycolen)

Question Answering:

[x] MiniLM https://huggingface.co/deepset/minilm-uncased-squad2 (@Akshit17)
[x] ELECTRA https://huggingface.co/deepset/electra-base-squad2 (@sanjayk0508)
[x] DistilBert https://huggingface.co/distilbert-base-cased-distilled-squad (@fajemila)

Sound classification:

[x] speech emotions recognition (wav2vec) https://huggingface.co/harshit345/xlsr-wav2vec-speech-emotion-recognition (@paxF3E)
[x] Hubert key words spotting https://huggingface.co/superb/hubert-base-superb-ks (@100-87)

sahilpmehra commented 1 year ago

Hi, could you please assign me Rotated FCOS? Thanks!

Matrixmang0 commented 1 year ago

Hello everyone!

I would love to work on YOWOv2. If still unreserved, I request the admin to please reserve it for me.

Thank you

hadyy17 commented 1 year ago

Hello, I have assigned BeIT to myself and I am currently working on developing a notebook. I have one question: for the prerequisite task, do I need to adapt the pre-trained model mentioned above to a certain use case with OpenVINO support, or can I try making a model for my own use case using BeIT and then add the required support?

Any help is much appreciated.

adrianboguszewski commented 1 year ago

Could you assign CenterNet to me? Thank you

@18yz153, unfortunately, the CenterNet model was already chosen a few comments above yours. Please select another one.

paxF3E commented 1 year ago

Hello, I would like to work on Sound classification: speech emotions recognition (wav2vec). Can I please have this model assigned to me? Thank you.

eaidova commented 1 year ago

@hadyy17 what do you mean by your own use case? The most important thing is that the model solves a semantic segmentation task and runs with OpenVINO. If you can demonstrate your own use case while preserving these conditions, that is fine. Generally, using the model from the link provided with the task should be enough (it is an ADE20k-trained model from the Hugging Face Hub).

matrixbot123 commented 1 year ago

Hi everyone! I would like to work on DistilBART. Thank you

AlexFierro9 commented 1 year ago

Hi, I had assigned TimesFormer to myself, but I'm not able to convert the model to ONNX correctly (some error with PyTorch on my hardware). I've posted my issue on PyTorch Discuss but haven't gotten a response. Can I change my model to a TensorFlow-based one? If yes, then please assign SSD MobileNet V2!

Akshit17 commented 1 year ago

Hello everyone! I would like to work on MiniLM. Can I have this model assigned to me? Thank you.

AniketARS commented 1 year ago

Hello everyone, I would like to work on the SETR (https://github.com/fudan-zvg/SETR) model for my GSoC application. I'm guessing it is unreserved. If anyone is already working on this, kindly inform me and I will change my choice.

Thanks, Aniket

Paulooh007 commented 1 year ago

Hi, Everyone. I would like to assign FasterRCNN Inception ResNet v2 to myself

fajemila commented 1 year ago

I would like to assign DistilBert to myself

Tatwansh commented 1 year ago

I would like to assign DETR to myself

SandeepaDevin commented 1 year ago

I would like to assign YOLOS to myself

18yz153 commented 1 year ago

I would like to assign YOLOR to myself

thegeek13242 commented 1 year ago

I would like to assign YoloF to myself.

nischay7 commented 1 year ago

Hi, I would like to assign ReDet to myself

zhumakhan commented 1 year ago

Hi, I would like to assign OPT to myself.

Thanks

wulongjian commented 1 year ago

Hi, I would like to assign Roi_trans to myself. Thanks.

AlexFierro9 commented 1 year ago

Hi, there is an already built copy of SSD NET V2 here at TF Hub. The model can also be rebuilt from these checkpoints, which are available at the TF model zoo too, but that way requires the TensorFlow Object Detection API, which needs to be installed separately, to recompile the model and save it. So should I use the already existing saved model or build it again?

eaidova commented 1 year ago

@AlexFierro9 you can use any of these models, but please note that they are two different models; the first one is named SSD MobileNet V2 FPNLite 320x320 in the TF Object Detection API.

hazrulakmal commented 1 year ago

Hi, this might seem like a dumb question. Does one need a machine with an Intel processor to meaningfully contribute to this organization and complete the prerequisite task with OpenVINO toolkits? Would someone with, say, a Ryzen processor be able to complete the task?

eaidova commented 1 year ago

@hazrulakmal unfortunately, there are no prebuilt OpenVINO packages for AMD processors. We recommend using Google Colab or any other online Jupyter notebook environment to complete the prerequisite task.

mhy-666 commented 1 year ago

Hi, can I assign Deberta to myself? Thanks.

pyther-hub commented 1 year ago

@eaidova sorry for the inconvenience, but please assign me NanoDet, as there is no active documentation for YoloV7 Face.

sususama commented 1 year ago

Hi, I want to assign Photo2Cartoon to myself. Thanks.

ryan-utopia commented 1 year ago

Hi, could you please assign me UltraFace? If I finish it successfully before March 20th, I will also try DAMO YOLO. Thanks!

Nouran-Muhammad commented 1 year ago

Hi, Could you please assign me ESPNet? Thank you

weronikazak commented 1 year ago

Hello, could you please assign me Coltran? Thank you!

wulongjian commented 1 year ago

Hello, I had assigned Roi_trans to myself, but I'm not able to convert it correctly on an Intel CPU platform. As far as I know, this model currently supports only NVIDIA inference (see this issue #551 and get_started).
Can I change my model to another one? If yes, then please assign Robust Video Background matting to me. Thanks a lot.

AlexFierro9 commented 1 year ago

Hi, I have a question about model conversion. I've been referring to the model conversion guide here, Converting TensorFlow Object Detection API Models, according to the documentation and Colab tutorials available here. The mo command for conversion requires me to also specify the --transformations_config argument. Unfortunately, this model was not downloaded from the TensorFlow Object Detection API zoo, so I have no clue which file to use here. Should I skip this parameter outright, or should I use the JSON file specified for ssd_v2? By the way, my model is ssd_mobilenet_v2_320x320_coco17_tpu-8.

shrey-2803 commented 1 year ago

Hi, I would like to assign FinBERT to myself. Thank You!

sharvesh642 commented 1 year ago

Hi @adrianboguszewski @eaidova, I was assigned the movinet task. While trying to load the TensorFlow model after converting it to OpenVINO IR, I encountered an error: "Model file model/v3-small_224_1.0_float.xml cannot be opened!" When I tried to load the model in OpenVINO Runtime with ie.read_model, I encountered the error: "Model file model/MiDaS_small.xml cannot be opened!" Am I doing something wrong? How can I rectify it?

aadhamm commented 1 year ago

Hi, @adrianboguszewski @eaidova

I would like to assign "Real-ESRGAN (for real images)" to myself. Thank you for this insightful list and best of luck for all of us.

eaidova commented 1 year ago

@sharvesh642 which platform do you use for development? Please also check that the path to your model is correct and that both IR files (XML and BIN) are located in the same directory and have the same name apart from the extension.
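
The check described above can be sketched in a few lines of standard-library Python. This is only an illustrative helper (the name `check_ir_files` is hypothetical, not part of OpenVINO); it assumes the usual IR layout where the weights file shares the XML file's name with a `.bin` extension.

```python
from pathlib import Path
from typing import List


def check_ir_files(xml_path: str) -> List[str]:
    """Return a list of problems found with an OpenVINO IR model path."""
    xml = Path(xml_path)
    problems = []
    # The path handed to read_model should point at the .xml file.
    if xml.suffix != ".xml":
        problems.append(f"expected a .xml file, got: {xml.name}")
    if not xml.is_file():
        problems.append(f"missing XML file: {xml.resolve()}")
    # The weights are expected next to the XML, with the same base name.
    weights = xml.with_suffix(".bin")
    if not weights.is_file():
        problems.append(f"missing BIN file: {weights.resolve()}")
    return problems
```

Calling `check_ir_files("model/v3-small_224_1.0_float.xml")` before `read_model` prints absolute paths in its messages, which also helps spot a wrong working directory.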

eaidova commented 1 year ago

@AlexFierro9 you can use any suitable SSD MobileNet V2 model that you can run and convert, so model selection should not be a problem, right? :)

TF Object Detection models saved in the SavedModel format correspond to TF 2.0 and should work without a pipeline config.
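
A minimal sketch of what such a conversion call might look like, assuming the Model Optimizer `mo` tool from the `openvino-dev` package is installed and that the extracted archive contains a `saved_model` directory (both are assumptions; the paths below are placeholders). The helper `build_mo_command` is hypothetical and only assembles the command, without a `--transformations_config` argument.

```python
from typing import List


def build_mo_command(saved_model_dir: str, output_dir: str) -> List[str]:
    """Assemble a Model Optimizer command for a TF2 SavedModel directory."""
    return [
        "mo",
        "--saved_model_dir", saved_model_dir,
        "--output_dir", output_dir,
    ]


# Placeholder paths: adjust them to where the archive was extracted.
command = build_mo_command(
    "ssd_mobilenet_v2_320x320_coco17_tpu-8/saved_model", "model"
)
print(" ".join(command))
# To run the conversion: subprocess.run(command, check=True)
```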

pyther-hub commented 1 year ago

@eaidova My model is NanoDet. This is how I compile it:

import torch
from openvino.runtime import Core

# Read the IR model and compile it for CPU inference
core = Core()
model = core.read_model('/content/nanodet_model.xml')
compiled_model = core.compile_model(model, 'CPU')

After this, I run inference:

output_blob = compiled_model.output(0)
predictions = torch.from_numpy(compiled_model(input_tensor)[output_blob])

The shape of my predictions is (2125, 112), where 2125 is the number of boxes. The main problem is the 112: ideally it should be 85 = 1 + 4 + 80 (1 for the probability of an object, 4 for the box coordinates, and 80 for the probability of each class).

After a ton of digging, I found out that the model returns only this raw output, which then needs to be passed through post_process, where the bboxes are decoded. All the functions are complex and interlinked, so I am not able to move on from here. How can I get the required output from this 112?
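
One possible reading of these numbers, offered as an assumption rather than something confirmed in this thread: a NanoDet-Plus style head emits 80 class scores plus 4 box sides × 8 distribution bins (reg_max = 7), which gives 80 + 32 = 112, and a 320x320 input with strides 8, 16, 32, 64 yields 40² + 20² + 10² + 5² = 2125 prior points. Under those assumptions, a NumPy sketch of the decoding step could look like this (all function names here are hypothetical, and whether the class scores are already passed through a sigmoid depends on how the model was exported):

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def make_center_priors(input_size: int, strides) -> np.ndarray:
    """Build (cx, cy, stride) rows for every feature-map cell of every level."""
    priors = []
    for stride in strides:
        feat = int(np.ceil(input_size / stride))
        ys, xs = np.meshgrid(np.arange(feat), np.arange(feat), indexing="ij")
        cx = xs.reshape(-1) * stride
        cy = ys.reshape(-1) * stride
        priors.append(np.stack([cx, cy, np.full_like(cx, stride)], axis=1))
    return np.concatenate(priors, axis=0).astype(np.float32)


def decode(preds, input_size=320, strides=(8, 16, 32, 64), num_classes=80, reg_max=7):
    """Split raw (N, 112) predictions into (N, 4) boxes and (N, 80) class scores."""
    scores = preds[:, :num_classes]
    # Each box side is predicted as a discrete distribution over reg_max + 1 bins.
    dist = preds[:, num_classes:].reshape(-1, 4, reg_max + 1)
    expected = (softmax(dist, axis=-1) * np.arange(reg_max + 1)).sum(axis=-1)
    priors = make_center_priors(input_size, strides)
    # Distances are expressed in stride units: left, top, right, bottom.
    ltrb = expected * priors[:, 2:3]
    boxes = np.stack(
        [
            priors[:, 0] - ltrb[:, 0],
            priors[:, 1] - ltrb[:, 1],
            priors[:, 0] + ltrb[:, 2],
            priors[:, 1] + ltrb[:, 3],
        ],
        axis=1,
    )
    return boxes.clip(0, input_size), scores
```

The resulting boxes are in the coordinates of the resized network input and would still need confidence filtering and non-maximum suppression.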

MR-ENVYR commented 1 year ago

Hello, my model is DistilRoberta. There is no tiny model for it, so should I just use the available model file and write an export script for the model? I am asking because the example notebook used an export script that was available in the GitHub repo of the model. So should I follow a similar approach with an export script, or can I use any approach that is convenient? Thanks in advance :D

eaidova commented 1 year ago

@MR-ENVYR You can use any approach that suits you.

eaidova commented 1 year ago

@pyther-hub Possibly I do not quite understand the issue. What is the problem with applying post-processing to the model output after OpenVINO inference? It is normal practice that some models require additional steps to get results. E.g., if you look at the YOLOv7 tutorial, it also requires non-maximum suppression of the boxes after inference.
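
For reference, non-maximum suppression itself is short. The following is a generic greedy NMS sketch in NumPy, not the code used in the YOLOv7 tutorial; the `nms` function name and the 0.5 IoU threshold are illustrative choices.

```python
import numpy as np


def nms(boxes: np.ndarray, scores: np.ndarray, iou_threshold: float = 0.5) -> list:
    """Greedy NMS over (N, 4) boxes in (x1, y1, x2, y2) format; returns kept indices."""
    order = scores.argsort()[::-1]  # indices sorted by descending score
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))
        # Intersection of the best box with every remaining box.
        x1 = np.maximum(boxes[best, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[best, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[best, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[best, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        iou = inter / (areas[best] + areas[rest] - inter + 1e-9)
        # Drop every remaining box that overlaps the best one too much.
        order = rest[iou <= iou_threshold]
    return keep
```

In practice this is applied per class, after filtering out low-confidence predictions.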

mhy-666 commented 1 year ago

Hi, can I assign ELECTRA to myself? Thanks.

pyther-hub commented 1 year ago

Ma'am, the issue is that the model returns a different output that needs to be post-processed, but the post-processing step is very complex. I am trying to implement it. I also have one more backup option, which I will look at later. I will keep you posted on my progress. Thank you for your response.

pyther-hub commented 1 year ago

@eaidova I would like to change my model from NanoDet to Colorizer. I apologize for any inconvenience I may have caused. Despite my best efforts, I encountered difficulties in resolving the output issue. I reached out to several contributors via email, but one of them informed me that the code required for this conversion would be quite challenging to write. Moreover, the output I obtained did not match the model.forward result, which ultimately led me to a dead end.

mhy-666 commented 1 year ago

I'm sorry, I didn't notice the requirement that only one task can be assigned to a person. I have already completed the task that I previously received.

adrianboguszewski commented 1 year ago

@pyther-hub It's OK. Reassigned.

adrianboguszewski commented 1 year ago

@mhy-666 Yes. One notebook implementation is enough. Please make sure you open a PR with your work (it's part of the task) by 4th April.

pyther-hub commented 1 year ago

@adrianboguszewski Sir, I have made the notebook for the Colorizer task, but while writing the documentation I found that one [implementation of it is already there](https://github.com/openvinotoolkit/openvino_notebooks/tree/main/notebooks/222-vision-image-colorization). What shall I do now?

adrianboguszewski commented 1 year ago

@pyther-hub Just create a PR. If it was listed above, it means we would like to have it anyway. When you create a PR, we will let you know about the next steps.

lucifertrj commented 1 year ago

Hi @adrianboguszewski ,

Please assign YoloV7 Face to me

AlexFierro9 commented 1 year ago

openvinotoolkit / openvino_notebooks

Models for the Google Summer of Code 2023 prerequisite task #832