chonyy / AI-basketball-analysis

:basketball::robot::basketball: AI web app and API to analyze basketball shots and shooting pose.
https://ai-basketball-analysis.herokuapp.com/

AI basketball analysis on google colab #9

Open hardik0 opened 4 years ago

hardik0 commented 4 years ago

Hey @chonyy, I have tried this on Google Colab. Check it out: AI-basketball-analysis-on-google-colab

I'm facing an issue: on the server (Google Colab) the video was processed, but in the web browser the video stream is lagging. Please help me solve this.

Thanks for the great work!

chonyy commented 4 years ago

Hi @hardik0, first of all, I really like your work! I have been looking for a way for people to try the project without cloning it onto a local machine with a CUDA GPU, and I think your approach is the best way to do it.

Unfortunately, I'm not really familiar with the Google Colab environment, and I ran into a problem getting your project to work. I can reach the last cell and run it, but the browser refuses to connect.

chonyy commented 4 years ago

For your problem, I have two suggestions. Since I can't get your project to work, could you try them and tell me whether they help?

  1. It's lagging because there's tensorflow==1.15.2 in my requirements.txt, and I'm pretty sure this makes TensorFlow run on the CPU only, not the GPU. I noticed that your Google Colab environment has a default TensorFlow and CUDA setup. So try removing TensorFlow from requirements.txt; this should let the project run with the default GPU-enabled TensorFlow.

  2. I'm pretty sure you don't have to install OpenPose manually, since I have included all the required libraries in my repo (tell me if I'm wrong). Could you try removing the whole OpenPose installation section and see if the project still works?

hardik0 commented 4 years ago

> I can successfully reach the last cell and run it, but it refuses to connect on the web browser.

You opened the wrong link. Just open the *.ngrok.io link.

hardik0 commented 4 years ago

> So, try to remove TensorFlow in the requirements.txt, this should enable the project run with default TensorFlow-GPU.

  1. I tried the latest versions of tensorflow, werkzeug, matplotlib, gunicorn, and Pillow, but the issue is not resolved.

> Could you try to remove the whole OpenPose installation section and see if the project still works?

  2. I tried, but it doesn't work. We need to recompile OpenPose completely. Here is the reason: https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues?q=libopenpose.so+cannot+open+shared+object+file

Now I'm facing another issue. This problem occurs sometimes on some videos.

filename sample_video.mp4
filepath ./static/uploads/sample_video.mp4
127.0.0.1 - - [20/Jun/2020 08:04:47] "POST /sample_analysis HTTP/1.1" 200 -
filename sample_video.mp4
filepath ./static/uploads/sample_video.mp4
127.0.0.1 - - [20/Jun/2020 08:04:47] "POST /sample_analysis HTTP/1.1" 200 -
127.0.0.1 - - [20/Jun/2020 08:04:48] "GET /static/css/main.css HTTP/1.1" 200 -
127.0.0.1 - - [20/Jun/2020 08:04:48] "GET /static/img/basketball-icon.jpg HTTP/1.1" 200 -
/usr/local/python
Starting OpenPose Python Wrapper...
Auto-detecting all available GPUs... Detected 1 GPU(s), using 1 of them starting at GPU 0.
[ERROR:1] global /io/opencv/modules/videoio/src/cap.cpp (116) open VIDEOIO(CV_IMAGES): raised OpenCV exception:

OpenCV(4.2.0) /io/opencv/modules/videoio/src/cap_images.cpp:293: error: (-215:Assertion failed) !_filename.empty() in function 'open'

2020-06-20 08:04:50.120606: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-20 08:04:50.121233: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:00:04.0 name: Tesla T4 computeCapability: 7.5
coreClock: 1.59GHz coreCount: 40 deviceMemorySize: 14.73GiB deviceMemoryBandwidth: 298.08GiB/s
2020-06-20 08:04:50.121302: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-06-20 08:04:50.121376: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-06-20 08:04:50.121570: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-06-20 08:04:50.121616: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-06-20 08:04:50.121667: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-06-20 08:04:50.121724: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-06-20 08:04:50.121755: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-06-20 08:04:50.121864: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-20 08:04:50.122502: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-20 08:04:50.123056: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-06-20 08:04:50.123103: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-06-20 08:04:50.123119: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      0 
2020-06-20 08:04:50.123133: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0:   N 
2020-06-20 08:04:50.123284: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-20 08:04:50.123924: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-20 08:04:50.124511: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5428 MB memory) -> physical GPU (device: 0, name: Tesla T4, pci bus id: 0000:00:04.0, compute capability: 7.5)
127.0.0.1 - - [20/Jun/2020 08:04:50] "GET /video_feed HTTP/1.1" 500 -
Error on request:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/werkzeug/serving.py", line 304, in run_wsgi
    execute(self.server.app)
  File "/usr/local/lib/python3.6/dist-packages/werkzeug/serving.py", line 294, in execute
    for data in application_iter:
  File "/usr/local/lib/python3.6/dist-packages/werkzeug/wsgi.py", line 506, in __next__
    return self._next()
  File "/usr/local/lib/python3.6/dist-packages/werkzeug/wrappers/base_response.py", line 45, in _iter_encoded
    for item in iterable:
  File "/content/AI-basketball-analysis-on-google-colab/src/app_helper.py", line 80, in getVideoStream
    shooting_result['avg_elbow_angle'] = round(mean(shooting_pose['elbow_angle_list']), 2)
  File "/usr/lib/python3.6/statistics.py", line 311, in mean
    raise StatisticsError('mean requires at least one data point')
statistics.StatisticsError: mean requires at least one data point
chonyy commented 4 years ago

> Now I'm facing another issue. This problem occurs sometimes on some videos.

Hi @hardik0, I believe this is caused by missing exception handling. I apologize for being lazy.

The problem occurs when the program can't extract the required data from the input video, so the calculation fails.

To work around it, I suggest you start testing with the sample analysis or any sample video in the ./static/uploads directory. I have checked that the program works with those videos, and I'm still working on supporting a wider variety of input videos.
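The traceback above ends in `mean()` being called on an empty `elbow_angle_list`. A minimal sketch of the missing guard follows; the helper name `safe_mean` and its `default` parameter are hypothetical and not part of the repository, and only the dictionary keys are taken from the traceback.

```python
from statistics import mean


def safe_mean(values, ndigits=2, default=None):
    """Return the rounded mean of `values`, or `default` when the list is empty.

    Hypothetical helper: statistics.mean raises StatisticsError on an empty
    sequence, which is what the traceback shows for videos with no detected pose.
    """
    if not values:
        return default
    return round(mean(values), ndigits)


# Keys mirror the ones in the traceback; the data here is made up.
shooting_pose = {"elbow_angle_list": []}
shooting_result = {
    "avg_elbow_angle": safe_mean(shooting_pose["elbow_angle_list"]),
}
print(shooting_result)
```

With a guard like this the stream would report a missing value for such videos instead of answering the `/video_feed` request with a 500 error.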

chonyy commented 4 years ago

> 1. I tried latest version of tensorflow, werkzeug, matplotlib, gunicorn, Pillow but this issue doesn't resolve.

Hi @hardik0, I still think this problem is caused by TensorFlow running on the CPU. Could you make sure that you have installed tensorflow-gpu? Also try uninstalling the regular tensorflow package before hosting; this should keep the project from running on the CPU.

I uninstalled the regular tensorflow, installed tensorflow-gpu, and got a fairly large improvement in FPS. I don't know why it's still a little slower than my local GTX 1060 (about 20 FPS), but it should be workable. I don't know whether you are satisfied with this speed. I'm still working on changing the model to YOLOv4, which should improve the efficiency significantly.
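One way to confirm which device TensorFlow will use is to ask it directly from a notebook cell. This is a sketch under the assumption that TensorFlow 2.1 or later is installed (where `tf.config.list_physical_devices` is available); the function name `gpu_report` is made up for illustration.

```python
def gpu_report():
    """Return a one-line summary of the GPUs TensorFlow can see."""
    try:
        import tensorflow as tf  # imported lazily so the check also runs without TF
    except ImportError:
        return "TensorFlow is not installed in this environment"
    gpus = tf.config.list_physical_devices("GPU")  # assumes TF 2.1 or later
    if not gpus:
        return f"TensorFlow {tf.__version__} sees no GPU; inference will run on the CPU"
    names = ", ".join(gpu.name for gpu in gpus)
    return f"TensorFlow {tf.__version__} sees {len(gpus)} GPU(s): {names}"


print(gpu_report())
```

If the report shows no GPU on a Colab GPU runtime, the installed TensorFlow package is the likely cause rather than the hardware.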

> 2. I tried but doesn't work, We need to completely re-compile the whole openpose. here is the reason https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues?q=libopenpose.so+cannot+open+shared+object+file

Thanks a lot for telling me this! This sounds like bad news for me. Does this mean that no one can run my project without recompiling all of OpenPose (which takes about half an hour)?

hardik0 commented 4 years ago

Thanks, @chonyy!

> TensorFlow is running with CPU, could you make sure that you have installed tensorflow-gpu? Also try to uninstall normal tensorflow before hosting it, this could avoid the project run with CPU.

As far as I know, tensorflow 2.2.x comes with GPU support, but I also tried the tensorflow-gpu package: https://www.tensorflow.org/install/gpu

> I'm still working on this project to change the model to YOLOv4, which should be able to significantly improve the efficiency.

I would love to see YOLOv4.

> Does this mean that no one could run my project without recompiling the whole OpenPose? (which takes about half an hour)

Unfortunately, yes, because we are using OpenPose through its Python API. If we used the command-line interface, no additional installation would be required.

chonyy commented 4 years ago

Hi @hardik0, is your problem solved?

> As per my knowledge tensorflow 2.2.x comes with gpu support, but I also tried tensorflow-gpu version https://www.tensorflow.org/install/gpu

I call tf.disable_v2_behavior() in utils.py. I don't know whether this affects GPU support in TF2.

I uninstalled tensorflow, installed tensorflow-gpu, and ran your project. I think the FPS is good enough (about 15 to 20). How laggy was it when you ran it, and how fast do you expect it to be?

hardik0 commented 4 years ago

On the server side (Google Colab) the video was processed, but in the web browser the video stream is lagging.

I think the issue is not related to TensorFlow but to something else.

chonyy commented 4 years ago

Alright, I think I could finally reproduce your problem. The program already outputs the data of the third shot on the terminal, but the video stream is still rendering the first shot. This is actually really weird.

Are you able to run my project on your local machine? (not on Google colab) I'm pretty sure this problem will not occur if you run it on your local machine.

I think this is because the *.ngrok.io website hosted from Google Colab is really unstable. Sometimes I could barely connect to the webpage, and sometimes I even got disconnected from it. This could explain why the video was processed but not rendered successfully on the webpage. Unfortunately, I have no idea why this is happening. Maybe you could try hosting a simple Flask page from Google Colab and check whether it has any stability problems.
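The stability check suggested above could be made measurable with a small probe that requests a page repeatedly and counts failures. This is only a sketch using the Python standard library; the function name `probe`, its parameters, and the placeholder URL are assumptions, and it presumes some page is already being served through the tunnel.

```python
import http.client
import time
import urllib.request


def probe(url, attempts=5, timeout=5.0, pause=0.0):
    """Request `url` repeatedly and return (successes, failures, latencies).

    Latencies are in seconds and are recorded only for successful requests.
    """
    latencies = []
    failures = 0
    for _ in range(attempts):
        start = time.perf_counter()
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                response.read()
            latencies.append(time.perf_counter() - start)
        except (OSError, http.client.HTTPException):
            # URLError, timeouts and dropped connections are OSError subclasses.
            failures += 1
        time.sleep(pause)
    return len(latencies), failures, latencies


if __name__ == "__main__":
    # Placeholder address: replace it with the tunnel URL printed by the notebook.
    ok, failed, times = probe("http://127.0.0.1:5000/", attempts=3, timeout=2.0)
    print(f"{ok} succeeded, {failed} failed, latencies: {times}")
```

Running the same probe against the tunnel URL and against a local address would show whether the failures come from the tunnel or from the app itself.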

hardik0 commented 4 years ago

> Are you able to run my project on your local machine? (not on Google colab)

I don't have a GPU.

I found another way. References: https://stackoverflow.com/a/61504116 https://stackoverflow.com/a/54760106

To get the cell output, add this to app.py and use the !cat app.log command to view the log:

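import logging

# Send log records to app.log so they can be read from a Colab cell with `!cat app.log`.
logging.basicConfig(
    filename='app.log',  # write to this file
    filemode='a',        # open in append mode
    format='%(name)s - %(levelname)s - %(message)s',
)

# Use the named logger so its name appears in each record.
logger = logging.getLogger('my_logger')
logger.warning('This will get logged to a file')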

Also change the path (/usr/local/python) in src/utils.py.

p.s - clone your repository
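Whether the path mentioned above points at a working OpenPose build can be checked before starting the app. The sketch below assumes the build installs its Python bindings under that directory and exposes them as `openpose.pyopenpose`, which is the import used in OpenPose's Python examples; the function name `openpose_available` is hypothetical.

```python
import importlib
import sys


def openpose_available(install_dir="/usr/local/python"):
    """Return True if the OpenPose Python bindings can be imported.

    `install_dir` is the directory added to sys.path; the default matches the
    path printed in the log earlier in this thread.
    """
    if install_dir not in sys.path:
        sys.path.append(install_dir)
    try:
        importlib.import_module("openpose.pyopenpose")
    except ImportError:
        # Also raised when libopenpose.so cannot be opened by the loader.
        return False
    return True


print("OpenPose bindings importable:", openpose_available())
```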

chonyy commented 4 years ago

> I found out another way

Good to see you found a workaround. Is the problem solved with this new method?

Also, can I have your updated ipynb? I would really like to link to your project in my README. This is truly great work, and I'm glad to see people appreciate my project and even try to make it better.

hardik0 commented 4 years ago

Hey @chonyy, I have updated the Colab notebook and the issue is resolved! Check it out.