-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Is your feature request related to a problem? Please describe.
Short audio files cause error in panns inferenc…
-
Hi Vladimir,
Long time no talk :) I was wondering if you can share the code that converted the .npy features (from your [VGGish and I3D feature extractor](https://github.com/v-iashin/MDVC#raw-data-…
-
Dear Valentin,
First of all, thank you for sharing your code. I think it is a pretty powerful gated model with good results, congrats on that!
I would like to ask you about features_t.s3d. …
-
## ❓Question
Hi all
I am trying to use CoreML as a feature extractor (from sound analysis). I've split up by model as I said in issue #589; however, when I try to get Xcode to start the model I rec…
-
@plakal and @dpwe,
The TFRecord embeddings have a range of [-128, +127]. vggish_postprocess.py produces embeddings with range [0, 255]
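
Both ranges cover the same 256 values, so one possible explanation is that the same 8-bit data is being read as signed in one place and unsigned in the other. The sketch below is only an illustration of the two candidate mappings between the ranges; the sample values are made up, and which mapping (if either) applies to the actual TFRecord data is an assumption that would need to be checked against the files.

```python
import numpy as np

# Hypothetical 8-bit embedding values in the unsigned range [0, 255].
unsigned = np.array([0, 1, 127, 128, 255], dtype=np.uint8)

# Candidate 1 (assumption): the same bytes reinterpreted as signed
# two's-complement integers. No data is changed, only the dtype view.
reinterpreted = unsigned.view(np.int8)

# Candidate 2 (assumption): values shifted by a constant offset of 128.
# Widen to int16 first so the subtraction cannot wrap around.
shifted = (unsigned.astype(np.int16) - 128).astype(np.int8)

# Either mapping is lossless and can be inverted.
back_from_view = reinterpreted.view(np.uint8)
back_from_shift = (shifted.astype(np.int16) + 128).astype(np.uint8)

print(reinterpreted.tolist())
print(shifted.tolist())
```

Comparing a few known embeddings under both mappings should show which one, if any, reproduces the `vggish_postprocess.py` output.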
-
```
public void test(View view) {
    Module module = null;
    Log.i("test", "test: load pt fail!");
    try {
        // creating bitmap from packaged into app android asset 'im…
```
-
Hi @ttengwang ~
Thanks for sharing your wonderful work! I want to caption my custom video, but unfortunately I find that most captioning code starts from extracted features, and lit…
-
Hi ~ @v-iashin ,
Thanks for sharing your wonderful work and the detailed instructions on the usage! I want to caption my own video with the provided pre-trained model. But the video doesn't have an a…
-
When I run this command:
python ./sample/single_video_prediction.py \
> --prop_generator_model_path ./sample/best_prop_model.pt \
> --pretrained_cap_model_path ./sample/best_cap_…
-
I am new to YAMNet and VGGish.
When I use YAMNet/inference.py to run predictions on different audio clips, I find that clips of any duration (longer than 975 ms) can be converted into features and …
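
The 975 ms figure matches what simple framing arithmetic gives. The sketch below is only a back-of-the-envelope check, not YAMNet's code: it assumes a 25 ms STFT window, a 10 ms STFT hop, a 0.96 s patch window and a 0.48 s patch hop (taken to be YAMNet's defaults), and it ignores any padding the real implementation may apply. The function names are hypothetical.

```python
# Assumed framing parameters, in milliseconds (believed YAMNet defaults).
STFT_WINDOW_MS = 25
STFT_HOP_MS = 10
PATCH_WINDOW_MS = 960
PATCH_HOP_MS = 480

FRAMES_PER_PATCH = PATCH_WINDOW_MS // STFT_HOP_MS   # 96 spectrogram frames
FRAMES_PER_HOP = PATCH_HOP_MS // STFT_HOP_MS        # 48 spectrogram frames


def min_duration_ms() -> int:
    """Shortest clip that fills one full patch without padding."""
    return (FRAMES_PER_PATCH - 1) * STFT_HOP_MS + STFT_WINDOW_MS


def num_patches(duration_ms: int) -> int:
    """Number of complete patches in a clip, ignoring any padding."""
    if duration_ms < min_duration_ms():
        return 0
    frames = 1 + (duration_ms - STFT_WINDOW_MS) // STFT_HOP_MS
    return 1 + (frames - FRAMES_PER_PATCH) // FRAMES_PER_HOP


print(min_duration_ms())
print(num_patches(975), num_patches(1455))
```

Under these assumptions, 96 frames spaced 10 ms apart span 950 ms, and the last frame's 25 ms window extends that to 975 ms.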