-
This is an umbrella issue to track progress and discuss priority items. Comments and requests are always welcome.
## Milestones
- [x] ~ 4/26 (Sun): Refactor my Jupyter-based code to Python scrip…
-
Hi,
Could you explain a bit more about how the reshape_wav2img function works?
Why can we simply interpolate T and F to spec_size? What do target_T and target_F represent here?
A detailed ex…
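For readers following this question: the interpolation being asked about can be illustrated with a minimal, hypothetical sketch (the shapes and `spec_size` value below are example assumptions, not taken from the repo) that resizes a `(B, C, T, F)` spectrogram so its time/frequency axes match the fixed input size a ViT-style backbone expects:

```python
import torch
import torch.nn.functional as F

# Example spectrogram: (batch, channel, T, F). Shapes are illustrative only.
spec = torch.randn(1, 1, 1024, 64)

# Hypothetical target sizes standing in for target_T / target_F derived
# from the model's spec_size.
target_T, target_F = 256, 256

# Bicubic resize of both axes at once, so any (T, F) input maps onto the
# fixed grid the transformer was built for.
resized = F.interpolate(spec, size=(target_T, target_F),
                        mode="bicubic", align_corners=True)
print(resized.shape)  # torch.Size([1, 1, 256, 256])
```

The key point is that interpolation only rescales the time/frequency grid; it does not change what the spectrogram represents, which is why a single fixed `spec_size` can absorb inputs of varying resolution.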
-
Hi, thanks for the interesting work!
I have a question about the infer mode in htsat.py. During training, the audio input is always 10 seconds long. During inference, the model needs to hand…
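One common way to reconcile a fixed training length with variable-length inference is to chunk the waveform into training-sized clips and aggregate per-clip predictions. The sketch below is a generic, hypothetical illustration of that idea (the `model` interface, 32 kHz rate, and averaging strategy are assumptions, not the repo's actual infer mode):

```python
import torch

def infer_long_audio(model, wav, clip_len=10 * 32000):
    """Hypothetical sketch: split audio longer than the 10 s training
    length into fixed-size clips, run the model on each clip, and
    average the clip-level scores into one prediction."""
    # Zero-pad so the waveform divides evenly into clips.
    n_clips = (wav.shape[-1] + clip_len - 1) // clip_len
    padded = torch.nn.functional.pad(wav, (0, n_clips * clip_len - wav.shape[-1]))
    clips = padded.reshape(n_clips, clip_len)
    # Run each clip through the model and average the scores.
    scores = torch.stack([model(c.unsqueeze(0)) for c in clips])
    return scores.mean(dim=0)
```

Other aggregation choices (max-pooling the clip scores, or overlapping windows) trade recall against precision for rare events.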
-
### News
- Conference
    - ICLR 2023: Rwanda! (west of Kenya and Tanzania, east of the Democratic Republic of the Congo)
![image](https://user-images.githubusercontent.com/11782739/166086698-b848f6aa-c5fb-48fe-b337-76ee61b5d110.png)
- [Hyperscale…
-
rt
-
Hello,
Thanks for writing such an interesting paper! This adaptation of an off-the-shelf Vision Transformer into a spectrogram transformer is truly fascinating, backed by great results!
I was won…
-
Hi, thank you for sharing this!
I'm trying to use the HTSAT for SED with strong labels, i.e. with known onset and offset times. I have found that with the default config, the input shape is `(batch…
-
#20
Hi, thanks for the reply to the last issue. Now I can make the training script run, but the loss seems to be very high. I will attach the code and result below:
```
import pyroomacoustics as p…
```
-
Hi @kan-bayashi
A general question regarding the LibriTTS vocoders -- I am using them (melgan/hifigan/parallel_wavegan) to decode the spectrograms generated from LibriTTS Tacotron2/Transformer-TTS…
-
### News
- Conference
- ICLR 2022:
    - NAVER CLOVA presentation schedule: https://naver-career.gitbook.io/en/teams/clova-cic/events/naver-clova-iclr-2022
- ML in Korea: https://naver-career.gitbook.io/en/teams/cl…