voldemortX opened this issue 3 years ago
The maintainers have rather low bandwidth these days; about half of the Q1 features remain unfinished and have been pushed to Q2. Any help would be much appreciated!
I am interested in helping to support this library. I am in a position to add support for comma10k as well as GAN-based weather augmentations soon. I have previously trained this on BDD100K: https://github.com/hustvl/YOLOP
I really do not recommend this architecture: despite seeming very attractive, being very flexible in training, and learning all the tasks well, I could not convert it to TensorRT. I opened issues both there and in the tensorrtx repository, but the HUST students did not help me at all, and many files were missing from their repository.
See here:
https://github.com/hustvl/YOLOP/issues/12 https://github.com/wang-xinyu/tensorrtx/issues/793
I would also like to move these two papers into the Roadmap:
CondLaneNet: https://arxiv.org/abs/2105.05003
FOLOLane: https://arxiv.org/abs/2105.13680
I am an MS in AI candidate at Boston University (taking a semester off because the courses are useless). I see that you are a student at SJTU, @voldemortX. I am a huge fan of your university and have read many papers from there; I regard them as the #1 world leader in many computer vision application fields, particularly surveillance. I would enjoy working with you.
Right now I have trained DDRNet for real-time semantic segmentation on the comma10k dataset. I think comma10k has huge value for the community because it is fully permissive, so we can augment it with labels it is missing, new formats, etc. I will update when I can submit some code; I will release it slightly after I build it into my pipeline, to give my company a slight edge before I make it open-source. I do not have much experience with pull requests, but I will do my best.
@SikandAlex It will be an honor to have your help as well!
> I am interested in helping to support this library. I am in a position to add support for comma10k as well as GAN-based weather augmentations soon. I have previously trained this on BDD100K: https://github.com/hustvl/YOLOP
Any new dataset support is welcome!
> I would also like to move these two papers into the Roadmap:
> CondLaneNet: https://arxiv.org/abs/2105.05003
> FOLOLane: https://arxiv.org/abs/2105.13680
CondLaneNet is open-sourced and could be easier to implement, while FOLOLane might prove a harder method that needs more work, since we do not yet have one of its backbones (BiSeNet).
I'll add them to the Roadmap for Q4, and they can of course continue into 2022. You can submit PRs whenever you have a ready-to-go batch of code (e.g., you have implemented a dataset class and tested its loading, or finished an algorithm). Thanks again for your help!
TensorRT support is also what @cedricgsh and I have talked about recently. We agree that pytorch-auto-drive should not stop at being a research codebase. Our primary aim would be a TensorRT benchmark for model speed and op-based FLOPs calculation from fvcore. But given our current bandwidth, I think that would need to wait until 2022Q1 at least (a refactor of the framework might be required).
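For the FLOPs part, fvcore's FlopCountAnalysis already gives op-level counts; something roughly like the sketch below, where the torchvision resnet18 and the input resolution are only placeholders, not what pytorch-auto-drive will actually use:

```python
# Rough sketch of op-based FLOPs counting with fvcore.
# The model (torchvision resnet18) and the 360x640 input are placeholders only.
import torch
from torchvision.models import resnet18
from fvcore.nn import FlopCountAnalysis, flop_count_table

model = resnet18().eval()
dummy_input = torch.randn(1, 3, 360, 640)

flops = FlopCountAnalysis(model, dummy_input)
print(f"Total GFLOPs: {flops.total() / 1e9:.2f}")
print(flop_count_table(flops))  # per-module / per-operator breakdown
```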
I have an AGX Xavier on hand that I will hopefully be able to use to provide some benchmarks for certain models. Unfortunately, I'm no expert at TensorRT custom layers, and it seems some operations simply cannot be supported, even by many experienced developers.
There are so many papers that claim a certain FPS when deployed to embedded GPUs, but they never release their code, or they only release testing code and no training code, or the results are not reproducible even when there is training code; really, it is a huge mess. I have spent the past 2 months trying to determine the best papers and the best approach as of Fall 2021, given my computational limitations.
After reading as many papers as I could over the past 3 months, driving around in my friend's Tesla, and looking at many other models (the AI research coming out of Asia is far ahead of what we have here in the US, in my opinion), I believe there are not that many critical components to Level 2 highway autonomy (which should be the first step):
1) Object Detection for Vehicles/Pedestrians
2) Segmentation of Driveable Surface / Road
3) Lane Detection (Segmentation with Post-Processing OR Polynomial/Keypoint/Row-Wise/Transformer/Non-Segmentation etc)
I have found that real-time segmentation approaches to lane markings need to incorporate high-resolution features and should not resize the input image before doing anything, as many networks do. This is because the semantics of the lane lines beyond a certain distance from the vehicle are lost at lower resolution, and then you can't predict the path far enough in advance. I also do not know how to post-process the segmentation-based approach properly, as I have not yet experimented with DBSCAN or RANSAC (a rough sketch of what I mean follows after the links below). From all my reading, this shows the most potential to me, but it seems to require the 4-lane probability output rather than just the binary mask for lane segmentation I am currently producing:
https://github.com/czming/RONELD-Lane-Detection https://arxiv.org/abs/2010.09548
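To be concrete about the kind of post-processing I have in mind on a binary lane mask: cluster the lane pixels into instances with DBSCAN, then fit a polynomial per instance. All the hyperparameters here (eps, min_samples, polynomial degree) are made-up defaults I have not tuned:

```python
# Rough sketch: cluster binary lane-mask pixels into lane instances with DBSCAN,
# then fit a 2nd-order polynomial x = f(y) per instance.
# eps, min_samples, and the polynomial degree are untuned placeholder values.
import numpy as np
from sklearn.cluster import DBSCAN

def lanes_from_mask(binary_mask, eps=5.0, min_samples=30, degree=2):
    ys, xs = np.nonzero(binary_mask)          # pixel coordinates of lane points
    points = np.stack([xs, ys], axis=1)
    if len(points) == 0:
        return []
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points)
    lanes = []
    for lane_id in set(labels) - {-1}:        # label -1 is DBSCAN noise
        lane_pts = points[labels == lane_id]
        # Fit x as a function of y, since lanes are roughly vertical in image space.
        coeffs = np.polyfit(lane_pts[:, 1], lane_pts[:, 0], deg=degree)
        lanes.append(coeffs)
    return lanes
```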
When I tested the models at https://github.com/Turoad/lanedet#Benchmark-and-model-zoo, the CondLaneNet model performed the best, which is why I recommended it be placed on the roadmap, although it uses a heavy ResNet-101 backbone.
4) Depth (Monocular/Stereo, ideally monocular)

After much research I narrowed down the candidate models to the following:
PyDNet
MobileStereoNet
MiDaS
MonoDepth(2/Wavelet)
FastDepth
LapDepth
HITNet

5) 3D Object Detection (maybe it can replace Depth)
FCOS3D
FCOS3D++/PGD
DD3D
6) Bird's Eye View / Top View / Projection Transform
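For the bird's-eye-view part, the simplest baseline I know of is a fixed-homography inverse perspective mapping; the sketch below uses made-up source points, whereas in practice they come from camera calibration or are picked manually on a flat-road frame:

```python
# Sketch of a fixed-homography bird's-eye-view (inverse perspective mapping).
# The four source points are made up; real ones come from camera calibration
# or are picked by hand on an image of a flat, straight road.
import cv2
import numpy as np

def to_birds_eye(frame, out_size=(400, 600)):
    h, w = frame.shape[:2]
    src = np.float32([[w * 0.45, h * 0.63],   # top-left of the road trapezoid (assumed)
                      [w * 0.55, h * 0.63],   # top-right
                      [w * 0.90, h * 0.95],   # bottom-right
                      [w * 0.10, h * 0.95]])  # bottom-left
    dst = np.float32([[0, 0],
                      [out_size[0], 0],
                      [out_size[0], out_size[1]],
                      [0, out_size[1]]])
    H = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(frame, H, out_size)
```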
Finally, this is the best GitHub project that I've been able to find related to self-driving. All the models seem to come from OpenVINO / the PINTO model zoo.
https://github.com/iwatake2222/self-driving-ish_computer_vision_system
I'm not sure what this model is trained on, or what architecture, but it seems to work well? https://docs.openvino.ai/2018_R5/_docs_Transportation_segmentation_curbs_release1_caffe_desc_road_segmentation_adas_0001.html
On the control side of things, an MPC solver is the solution to use for lateral/longitudinal control.
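To illustrate what I mean by MPC, here is a toy sketch of a linear longitudinal (speed-tracking) MPC in cvxpy; the point-mass model, horizon, weights, and limits are all placeholder assumptions, not something I have validated:

```python
# Toy longitudinal MPC sketch (point-mass model, receding horizon).
# Horizon length, cost weights, and acceleration limit are placeholders.
import cvxpy as cp

def plan_longitudinal(v0, v_ref, horizon=20, dt=0.1, a_max=2.0):
    v = cp.Variable(horizon + 1)   # speed over the horizon [m/s]
    a = cp.Variable(horizon)       # acceleration commands [m/s^2]

    cost = cp.sum_squares(v - v_ref) + 0.1 * cp.sum_squares(a)
    constraints = [v[0] == v0, cp.abs(a) <= a_max]
    for k in range(horizon):
        constraints.append(v[k + 1] == v[k] + dt * a[k])  # simple kinematics

    cp.Problem(cp.Minimize(cost), constraints).solve()
    return a.value[0]  # apply only the first command each cycle

# Example: currently at 20 m/s, target 25 m/s.
print(plan_longitudinal(v0=20.0, v_ref=25.0))
```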
This is the extensive information I have been able to collect. I automatically discarded models that didn't have code implementations, but I think I could have made a mistake here or there. I was going to keep all this information to myself, but all I do is code non-stop all day, have no life, and still only make slow progress. My friends just work at companies like Facebook and Google and don't want to work on anything exciting; it's impossible to get them to give up stock to work with me, and there is also a learning curve. So hopefully, by giving back to the open-source community, they will give back to me and we can all improve science and also make money.
Well, my knowledge about self-driving is mostly at the research stage for now, and I certainly learned a lot from your comments. I'm still skeptical about deep learning's performance in actual self-driving applications, though, especially on RGB inputs. So what I do at SenseTime now is more about human-centric applications.
Btw, we'll release our own lane detector later, by early 2022 at the latest. It achieves reasonable performance end-to-end (~75 on CULane, ~95 on LLAMAS) at 150 FPS in PyTorch with a small model, which we believe could be beneficial to applications. But it needs to remain in a private repo for now.
@SikandAlex We have now pushed initial support for ONNX and TensorRT conversions; maybe they can be helpful to your applications? Refer to DEPLOY.md.
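For anyone landing here before reading DEPLOY.md: a generic PyTorch-to-ONNX export looks roughly like the sketch below. This is not our actual conversion script; the model, input resolution, and opset version are placeholders:

```python
# Generic ONNX export sketch; NOT pytorch-auto-drive's own deployment script.
# The model (torchvision resnet18), input size, and opset version are placeholders.
import torch
from torchvision.models import resnet18

model = resnet18().eval()
dummy_input = torch.randn(1, 3, 360, 640)

torch.onnx.export(
    model, dummy_input, "model.onnx",
    opset_version=11,
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
)
# The resulting model.onnx can then be built into a TensorRT engine, e.g. with:
#   trtexec --onnx=model.onnx --saveEngine=model.trt --fp16
```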
Roadmap for our users (to state feature requests) and contributors.
*: Low priority tasks.
2022Q? (2022.4.2 - ):
2022Q1 (2022.1.10 - 2022.3.31):
2021Q4 (2021.10.1 - 2021.12.31):
2021Q3 (2021.7.1 - 2021.9.30):
2021Q2 (2021.4.1 - 2021.6.30):
- Add ERFNet-PRNet (Awaiting more info)
- Add ENet-SAD* (Unable to re-implement)
- Explore ERFNet-SAD* (Unable to re-implement)

2021Q1 (- 2021.3.31):
- Add RESA (VGG16, ResNets)
- Explore ERFNet-RESA
- Add ResNet18-LSTR (Try add a LSTR that is directly comparable with other methods, i.e. on a common backbone)
- Add ERFNet-PRNet
- Add ENet-SAD*
- Explore ERFNet-SAD*
- Add ERFNet training from scratch
- Support BDD100K*
- Support LLAMAS