Open bhack opened 4 years ago
@MohsenFayyaz89 Do you plan to contribute your dataset?
Would like to but unfortunately, I'm quite busy.
Too bad, it would have been useful for those who want to use your dataset but are too busy to create the adapter classes :stuck_out_tongue_winking_eye:
/cc @raviddoss do you know anyone internal who could be interested in taking charge of this?
@bhack I would like to contribute to this issue. Please assign this to me. Also, since I am new to this repo, please guide me through.
@Naman-Bhrgv Please start with the standard guide: https://www.github.com/tensorflow/datasets/tree/master/docs%2Fadd_dataset.md
@bhack Thanks!
@Naman-Bhrgv Are you currently working on this? @bhack I would like to take this up. Could you provide me with the current status?
@NikhilBartwal I am currently not working on this. If you want, you can take up this issue.
Ok. @NikhilBartwal do you want to prepare a PR?
@bhack I would very much like to. As I'm new to this, I wanted to ask if you could guide me through the process. The dataset contains YouTube video IDs rather than the videos themselves, so I'm not sure how to prepare the dataset in TFDS after downloading the video IDs. Could you help me through it? Thanks!
For the video set, you can take a look at the existing video datasets: https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/video
@bhack Thanks for the help! I will look into some examples and start working on it.
Hey @bhack @ChanchalKumarMaji @Conchylicultor, I was trying out this script https://github.com/holistic-video-understanding/HVU-Downloader/blob/master/HVU_download.py and what I found was that, whether I ran the script directly from the terminal or integrated its code into my notebook, there were always some files that were not downloaded properly (around 10% of them, each about 252 bytes). The script uses joblib.Parallel for parallelisation. What do you think could be causing that?
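One way to narrow down the failed downloads described above is to scan the output folder for suspiciously small files and retry only those. The sketch below is a minimal, hedged example: the function name `find_truncated_downloads`, the 1 KiB threshold, and the `.mp4` suffix are assumptions for illustration and are not part of the HVU downloader.

```python
from pathlib import Path


def find_truncated_downloads(root, min_bytes=1024, suffix=".mp4"):
    """Return sorted paths under `root` whose size is below `min_bytes`.

    Hypothetical helper: the threshold and suffix are assumptions chosen so
    that stub files of a few hundred bytes are flagged as failed downloads.
    """
    root = Path(root)
    return sorted(
        path
        for path in root.rglob(f"*{suffix}")
        if path.is_file() and path.stat().st_size < min_bytes
    )
```

The returned paths can then be deleted and fed back to the downloader one at a time, which also helps show whether the failures come from the parallel execution or from the individual videos.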
@NikhilBartwal Have you tried to open a ticket at https://github.com/holistic-video-understanding/HVU-Downloader/?
@bhack I have just opened one at https://github.com/holistic-video-understanding/HVU-Dataset/issues/3
@bhack Hey, I had one question: what do you think would be the most efficient way of decoding the video to a NumPy array inside a script?
@bhack Guess it was a typo :( I will have a look at it. Thanks !
@bhack Well, I tried it, and it looks like tensorflow-io only supports FFmpeg on Ubuntu 14.04, 16.04, and 18.04, as mentioned here. Do you have any other ideas?
I think that Tensorflow IO is what the other video datasets are using in this repo /cc @yongtang
@bhack I checked the video datasets and unfortunately, I couldn't find them using tfio. Could you give a link to it?
@NikhilBartwal Do you have a specific platform (such as Ubuntu 20.04?) you want to use?
@Naman-Bhrgv I assumed they were reusing the TF-IO API, but it seems they are controlling an ffmpeg subprocess: https://github.com/tensorflow/datasets/blob/master/tensorflow_datasets/core/features/video_feature.py#L116
@yongtang I was working on a video dataset for TFDS, so I was hoping for a platform-independent way of decoding a video file.
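For reference, the general subprocess approach can be sketched as follows. This is a minimal illustration under stated assumptions, not the TFDS implementation: the helper names (`build_ffmpeg_command`, `split_raw_frames`, `decode_video`) are hypothetical, an `ffmpeg` binary is assumed to be on PATH, and frames are rescaled to a caller-supplied width and height so the raw byte stream can be split without probing the file.

```python
import shutil
import subprocess


def build_ffmpeg_command(video_path, width, height):
    """Build an ffmpeg call that writes raw RGB24 frames to stdout."""
    return [
        "ffmpeg", "-nostdin", "-loglevel", "error",
        "-i", str(video_path),
        "-vf", f"scale={width}:{height}",  # force a known frame size
        "-f", "rawvideo", "-pix_fmt", "rgb24",
        "pipe:1",
    ]


def split_raw_frames(raw, width, height):
    """Split a raw RGB24 byte stream into one bytes object per frame."""
    frame_size = width * height * 3  # 3 bytes per pixel for rgb24
    if len(raw) % frame_size:
        raise ValueError("byte stream is not a whole number of frames")
    return [raw[i:i + frame_size] for i in range(0, len(raw), frame_size)]


def decode_video(video_path, width, height):
    """Run ffmpeg as a subprocess and return the decoded frames as bytes."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg binary not found on PATH")
    result = subprocess.run(
        build_ffmpeg_command(video_path, width, height),
        stdout=subprocess.PIPE,
        check=True,
    )
    return split_raw_frames(result.stdout, width, height)
```

If NumPy is available, each frame can be viewed as an array with `np.frombuffer(frame, dtype=np.uint8).reshape(height, width, 3)`. Because this only depends on the ffmpeg binary, it should work on any platform where ffmpeg is installed, at the cost of holding the whole decoded video in memory.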
@bhack I will have a look at it. Thanks!
@bhack @Conchylicultor Could you review the PR?
@NikhilBartwal thank you for the PR. I am on vacation. It could be nice if a TFDS maintainer could do a first pass in the meantime.
@bhack I didn't know that. Sorry for the disturbance :(
Folks who would also like to see this dataset in tensorflow/datasets, please thumbs-up so the developers can know which requests to prioritize. And if you'd like to contribute the dataset (thank you!), see our guide to adding a dataset.
/cc @alidiba67 @MohsenFayyaz89