neuroethology / MARS

End-user version of the Mouse Action Recognition System (MARS)

Missing directory in models folder #17

Closed: neven-x closed this issue 3 years ago

neven-x commented 3 years ago

Hello, I'm testing MARS on some of our videos. The pose estimation and feature extraction steps work, but the classification part could not find the directory below:

Feats frame 37829/37829  100% --  [Elapsed Time: 1:24:50] |##| (Time: 1:24:50) 2)
WinFeats 100% -- Feature 262/262 [Elapsed Time: 0:00:12] |###| (Time: 0:00:12)
[DONE] top 85.82 mins
Thread features done
loading top features
predicting labels
predicting probabilities
[WinError 3] The system cannot find the path specified: 'models/classifier/top_tm_xgb500_wnd'
[WinError 3] The system cannot find the path specified: 'models/classifier/top_tm_xgb500_wnd'
Queue ended
QThread: Destroyed while thread is still running
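
The missing-path error above can be narrowed down by listing which classifier folders actually exist on disk. A minimal sketch, assuming it is run from the MARS root folder so that the relative path models/classifier resolves; list_classifier_dirs is a hypothetical helper and not part of MARS.

```python
from pathlib import Path


def list_classifier_dirs(models_root="models/classifier"):
    """Return sorted sub-directory names under models_root ([] if it is missing)."""
    root = Path(models_root)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())


if __name__ == "__main__":
    # Folder name copied from the WinError 3 message in the log above.
    expected = "top_tm_xgb500_wnd"
    found = list_classifier_dirs()
    print("classifier folders found:", found)
    print("expected folder present:", expected in found)
```

If the expected folder is absent but a similarly named one is present, the code and the shipped models are probably out of sync, which is easier to fix by pulling the latest repository than by renaming folders by hand.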

I've manually changed the directory name from "top_xgb_wnd" to "top_tm_xgb500_wnd", resulting in the following error:

   Extracting top pose from RI_C00504862_E_rank1_top2020-11-18T18_24_32_Top.avi...
1 - Extracting pose
1 - Pose already extracted. Change your settings to override, if you still want to extract the pose.
2 - Features top already extracted
loading top features
Thread features done
predicting labels
predicting probabilities
############################## closeinvestigation #########################
Transforming features...
The reset parameter is False but there is no n_features_in_ attribute. Is this estimator fitted?
The reset parameter is False but there is no n_features_in_ attribute. Is this estimator fitted?
Queue ended

It also seems that MARS is only using the CPU (at least for pose estimation and feature extraction) - is this expected or are we missing out on performance gains?

Thanks for your help.

annkennedy commented 3 years ago

Hello, sorry for the delay, and thank you for pointing out this issue! This looks like a version issue with the trained classifiers. I will look into it and get back to you shortly.

In the meantime: MARS should definitely be making use of your GPU (assuming it is an Nvidia GPU). This problem can arise if the conda environment was built before CUDA was installed. If you activate the MARS environment and call conda list, you can check the installed packages; you should see both tensorflow and tensorflow-gpu. If not, try removing and rebuilding the environment, or installing tensorflow-gpu=1.15 from within the environment. If tensorflow-gpu is present, try launching python and calling import tensorflow as tf followed by tf.test.is_gpu_available(), which we'd hope would return True. If it returns False, then tensorflow is not detecting your GPU; this can happen if you have an old version of CUDA (e.g. before 9.0). This is unfortunately a common issue with tensorflow. Let me know if any of these suggestions help, or we can try digging deeper.
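
The GPU check described above can be wrapped in a small script. This is a minimal sketch, assuming it is run inside the activated MARS environment; gpu_visible_to_tensorflow is a hypothetical helper name, and the only TensorFlow call used is tf.test.is_gpu_available(), the one mentioned in this comment.

```python
def gpu_visible_to_tensorflow():
    """Return True/False for GPU visibility, or None if TensorFlow cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in the active environment.
        return None
    # TF 1.x style check; it still exists in TF 2.x but is marked deprecated there.
    return bool(tf.test.is_gpu_available())


if __name__ == "__main__":
    result = gpu_visible_to_tensorflow()
    if result is None:
        print("tensorflow is not installed in this environment")
    elif result:
        print("tensorflow can see at least one GPU")
    else:
        print("tensorflow is installed but does not detect a GPU")
```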

annkennedy commented 3 years ago

Hi neven-x: the error you were experiencing was due to a sklearn version issue in the conda environments provided with this repository. I've corrected both the Linux and Windows versions of the environment, and confirmed that the code now runs to completion on the test videos posted at https://data.caltech.edu/records/1655. I also fixed the models directory name error you mentioned.

To apply these fixes, please pull the latest version of the master branch of the repository, and then delete and re-build the MARS conda environment.
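
After rebuilding the environment, the installed library versions can be confirmed from Python. A minimal sketch: installed_versions is a hypothetical helper, not part of MARS, and the module list is an assumption (sklearn and tensorflow are named in this thread; xgboost is only inferred from the classifier folder name).

```python
import importlib


def installed_versions(modules=("sklearn", "xgboost", "tensorflow")):
    """Map each module name to its __version__ string, or None if it cannot be imported."""
    versions = {}
    for name in modules:
        try:
            mod = importlib.import_module(name)
        except ImportError:
            versions[name] = None
            continue
        # Some modules do not expose __version__; report that explicitly.
        versions[name] = getattr(mod, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(name, "->", version if version is not None else "not installed")
```

Comparing this output against the versions pinned in the repository's environment file is a quick way to tell whether the rebuild picked up the corrected sklearn version.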

Thank you for catching this error, and please don't hesitate to get in touch if you encounter other issues! One note: the MARS option to save tracked videos may still be out of date; I will be checking this code shortly. In the meantime, if you would like to visualize MARS pose estimates and behavior labels, I recommend you load your video, pose_top.mat, and annotation text file into Bento, provided at github.com/annkennedy/bento. Please let me know if this gives you trouble. If pose or behavior classifier performance is poor, as can be the case in videos that don't resemble the MARS training set, note that we will soon be releasing code allowing you to fine-tune the pose models and classifiers to your own videos.

neven-x commented 3 years ago

Hi Ann, thanks for this. MARS seems to be running the classification fine now, and I look forward to seeing how it handles our data. I followed your recommendations for checking GPU detection and everything seems to be OK, but I still can't see any GPU utilisation in Task Manager. Currently, classification of an approximately 13500-frame video at 640x480 resolution takes about 28 minutes.

tf.test.is_gpu_available()
2021-01-05 16:43:50.158338: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2021-01-05 16:43:50.188717: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2021-01-05 16:43:50.558429: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: Quadro P2000 major: 6 minor: 1 memoryClockRate(GHz): 1.4805
pciBusID: 0000:73:00.0
2021-01-05 16:43:50.571597: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 1 with properties:
name: Quadro M4000 major: 5 minor: 2 memoryClockRate(GHz): 0.7725
pciBusID: 0000:17:00.0
2021-01-05 16:43:50.584381: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2021-01-05 16:43:50.625005: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2021-01-05 16:43:50.701928: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2021-01-05 16:43:50.721494: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2021-01-05 16:43:50.762493: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2021-01-05 16:43:50.812497: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2021-01-05 16:43:51.117237: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2021-01-05 16:43:51.143108: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0, 1
2021-01-05 16:44:04.488691: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-01-05 16:44:04.499719: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0 1
2021-01-05 16:44:04.505658: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N N
2021-01-05 16:44:04.512519: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 1: N N
2021-01-05 16:44:04.575174: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/device:GPU:0 with 3849 MB memory) -> physical GPU (device: 0, name: Quadro P2000, pci bus id: 0000:73:00.0, compute capability: 6.1)
2021-01-05 16:44:04.700033: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/device:GPU:1 with 6445 MB memory) -> physical GPU (device: 1, name: Quadro M4000, pci bus id: 0000:17:00.0, compute capability: 5.2)
True
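
GPU load can also be read directly from the driver while MARS is running, as a cross-check on what Task Manager displays. A minimal sketch, assuming the nvidia-smi tool that ships with the Nvidia driver is on the PATH; query_gpu_utilization is a hypothetical helper and not part of MARS.

```python
import shutil
import subprocess


def query_gpu_utilization():
    """Return one 'name, utilization, memory used' line per GPU, or None if unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None  # driver tool not found on PATH
    cmd = [
        exe,
        "--query-gpu=name,utilization.gpu,memory.used",
        "--format=csv,noheader",
    ]
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if proc.returncode != 0:
        return None
    return [line.strip() for line in proc.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    rows = query_gpu_utilization()
    if rows is None:
        print("nvidia-smi not available")
    else:
        for row in rows:
            print(row)
```

Running this in a second terminal during pose estimation shows whether either of the two detected GPUs is doing any work.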

neven-x commented 3 years ago

Hi Ann, just wanted to update you that pose estimation didn't work too well on our data; I attached a clip below. The white mouse seems to be the more problematic one. Could this be due to the quite bright bedding in the cage? There also seems to be an offset between the pose estimate and the actual location of the animal, which I guess doesn't matter if the relative distances are accurate, unless estimates disappear when they are shifted out of the FOV. Being able to retrain the pose estimator on our own data would be a great addition.

https://user-images.githubusercontent.com/69473503/104185115-696ca280-540c-11eb-9088-6ccefd2072b0.mp4

annkennedy commented 3 years ago

Thank you for the update! My guess is that the offset in the pose is just an issue with the video saving code. I admit I haven't looked at that in ages and it might be in need of updating. If you have Matlab, I suggest using Bento (https://github.com/neuroethology/bentoMAT) to look at the MARS output. Launch Bento by calling bento from the command line, then select "Load Experiment", and navigate to the bento.xls file in the MARS output folder for your video; when Bento prompts for a pose model, specify "MARS_top". You'll be able to browse through your movie, poses, and annotations to see how it's performing.

As for the performance, you're right that the bedding is probably throwing it off with the white mouse. My guess is that if we re-train just the white mouse detector, MARS's performance could dramatically improve. We are preparing to release a separate MARS_Developer repo that will include code and jupyter notebooks for collecting manual pose annotations and re-training MARS, hopefully soon! In the meantime, our lab is very interested in getting MARS working out-of-the-box on videos from other setups. If you are willing to share a handful of videos with us, we would be happy to do the re-training of MARS ourselves and share the new detection and pose models with you. If you're interested, send me an email at ann.kennedy@northwestern.edu and we can make arrangements.