-
This is an umbrella issue to track progress and discuss priority items. Comments and requests are always welcome.
## Milestones
- [x] ~ 4/26 (Sun): Refactor my Jupyter-based code to Python scrip…
-
Hi,
Could you explain a bit more about how the reshape_wav2img function works?
Why can we simply interpolate T and F to spec_size? What do target_T and target_F represent here?
A detailed ex…
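For readers following this question: the interpolation being asked about can be illustrated with a minimal, hypothetical sketch (the shapes and `spec_size` value below are example assumptions, not taken from the repo) that resizes a `(B, C, T, F)` spectrogram so its time/frequency axes match the fixed input size a ViT-style backbone expects:

```python
import torch
import torch.nn.functional as F

# Example spectrogram: (batch, channel, T, F). Shapes are illustrative only.
spec = torch.randn(1, 1, 1024, 64)

# Hypothetical target sizes standing in for target_T / target_F derived
# from the model's spec_size.
target_T, target_F = 256, 256

# Bicubic resize of both axes at once, so any (T, F) input maps onto the
# fixed grid the transformer was built for.
resized = F.interpolate(spec, size=(target_T, target_F),
                        mode="bicubic", align_corners=True)
print(resized.shape)  # torch.Size([1, 1, 256, 256])
```

The key point is that interpolation only rescales the time/frequency grid; it does not change what the spectrogram represents, which is why a single fixed `spec_size` can absorb inputs of varying resolution.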
-
Hi, thanks for the interesting work!
I have a question about the infer mode in htsat.py. During training, the audio input is always 10 seconds long. During inference, the model needs to hand…
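One common way to reconcile a fixed training length with variable-length inference is to chunk the waveform into training-sized clips and aggregate per-clip predictions. The sketch below is a generic, hypothetical illustration of that idea (the `model` interface, 32 kHz rate, and averaging strategy are assumptions, not the repo's actual infer mode):

```python
import torch

def infer_long_audio(model, wav, clip_len=10 * 32000):
    """Hypothetical sketch: split audio longer than the 10 s training
    length into fixed-size clips, run the model on each clip, and
    average the clip-level scores into one prediction."""
    # Zero-pad so the waveform divides evenly into clips.
    n_clips = (wav.shape[-1] + clip_len - 1) // clip_len
    padded = torch.nn.functional.pad(wav, (0, n_clips * clip_len - wav.shape[-1]))
    clips = padded.reshape(n_clips, clip_len)
    # Run each clip through the model and average the scores.
    scores = torch.stack([model(c.unsqueeze(0)) for c in clips])
    return scores.mean(dim=0)
```

Other aggregation choices (max-pooling the clip scores, or overlapping windows) trade recall against precision for rare events.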
-
### News
- Conference
    - ICLR 2023: Rwanda! (west of Kenya and Tanzania, east of the Democratic Republic of the Congo)
![image](https://user-images.githubusercontent.com/11782739/166086698-b848f6aa-c5fb-48fe-b337-76ee61b5d110.png)
- [Hyperscale…
-
rt
-
Hello,
Thanks for writing such an interesting paper! This adaptation of an off-the-shelf Vision Transformer into a spectrogram transformer is truly fascinating, backed by great results!
I was won…
-
Hi, thank you for sharing this!
I'm trying to use the HTSAT for SED with strong labels, i.e. with known onset and offset times. I have found that with the default config, the input shape is `(batch…
-
#20
Hi, thanks for the reply to the last issue. Now I can make the training script run, but the loss seems to be very high. I will attach the code and result below:
```
import pyroomacoustics as p…
```
-
Hi @kan-bayashi
A general question regarding the LibriTTS vocoders -- I am using them (melgan/hifigan/parallel_wavegan) to decode the spectrograms generated from LibriTTS Tacotron2/Transformer-TTS…
-
### News
- Conference
- ICLR 2022:
    - NAVER CLOVA presentation schedule: https://naver-career.gitbook.io/en/teams/clova-cic/events/naver-clova-iclr-2022
- ML in Korea: https://naver-career.gitbook.io/en/teams/cl…