issues
TanUkkii007
/
papers-i-read
23
stars
3
forks
SESQA: Semi-Supervised Learning for Speech Quality Assessment
#726
TanUkkii007
opened
2 years ago
0
NORESQA: A Framework for Speech Quality Assessment using Non-Matching References
#725
TanUkkii007
opened
2 years ago
0
Lossy Image Compression with Compressive Autoencoders
#724
TanUkkii007
opened
2 years ago
0
End-to-end Optimized Image Compression
#723
TanUkkii007
opened
2 years ago
0
Neural Sequence-to-Sequence Speech Synthesis Using a Hidden Semi-Markov Model Based Structured Attention Mechanism
#722
TanUkkii007
opened
2 years ago
0
Neural HMMs are all you need (for high-quality attention-free TTS)
#721
TanUkkii007
opened
2 years ago
0
DenoiSpeech: Denoising Text to Speech with Frame-Level Noise Modeling
#720
TanUkkii007
opened
2 years ago
0
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis
#719
TanUkkii007
opened
2 years ago
0
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
#718
TanUkkii007
closed
2 years ago
0
Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing
#717
TanUkkii007
opened
2 years ago
0
CLUB: A Contrastive Log-ratio Upper Bound of Mutual Information
#716
TanUkkii007
closed
3 years ago
0
Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
#715
TanUkkii007
closed
2 years ago
0
Differentiable Signal Processing With Black-Box Audio Effects
#714
TanUkkii007
closed
3 years ago
0
SPEECH BERT EMBEDDING FOR IMPROVING PROSODY IN NEURAL TTS
#713
TanUkkii007
closed
3 years ago
0
DISENTANGLED SPEAKER AND LANGUAGE REPRESENTATIONS USING MUTUAL INFORMATION MINIMIZATION AND DOMAIN ADAPTATION FOR CROSS-LINGUAL TTS
#712
TanUkkii007
closed
3 years ago
0
A UNIVERSAL BERT-BASED FRONT-END MODEL FOR MANDARIN TEXT-TO-SPEECH SYNTHESIS
#711
TanUkkii007
closed
3 years ago
0
CONTEXT-AWARE PROSODY CORRECTION FOR TEXT-BASED SPEECH EDITING
#710
TanUkkii007
closed
3 years ago
0
PATNET: A PHONEME-LEVEL AUTOREGRESSIVE TRANSFORMER NETWORK FOR SPEECH SYNTHESIS
#709
TanUkkii007
closed
3 years ago
0
A Comparison of Discrete Latent Variable Models for Speech Representation Learning
#708
TanUkkii007
closed
3 years ago
0
MAPGN: MAsked Pointer-Generator Network for Sequence-to-Sequence Pre-training
#707
TanUkkii007
closed
3 years ago
0
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units
#706
TanUkkii007
closed
2 years ago
0
Deep Self-Learning From Noisy Labels
#705
TanUkkii007
opened
3 years ago
0
SpeechNet: A Universal Modularized Model for Speech Processing Tasks
#704
TanUkkii007
opened
3 years ago
0
Unsupervised Speech Recognition
#703
TanUkkii007
opened
3 years ago
0
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech
#702
TanUkkii007
closed
2 years ago
0
Evaluation of Text Generation: A Survey
#701
TanUkkii007
opened
3 years ago
0
Generative Modeling by Estimating Gradients of the Data Distribution
#700
TanUkkii007
closed
3 years ago
0
Score-Based Generative Modeling through Stochastic Differential Equations
#699
TanUkkii007
opened
3 years ago
0
Denoising Diffusion Probabilistic Models
#698
TanUkkii007
closed
3 years ago
0
Deep Unsupervised Learning using Nonequilibrium Thermodynamics
#697
TanUkkii007
closed
3 years ago
0
Symbolic Music Generation with Diffusion Models
#696
TanUkkii007
closed
3 years ago
0
Diff-TTS: A Denoising Diffusion Model for Text-to-Speech
#695
TanUkkii007
closed
3 years ago
0
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
#694
TanUkkii007
opened
3 years ago
0
VideoGPT: Video Generation using VQ-VAE and Transformers
#693
TanUkkii007
opened
3 years ago
0
Review of end-to-end speech synthesis technology based on deep learning
#692
TanUkkii007
opened
3 years ago
0
Do Transformer Modifications Transfer Across Implementations and Applications?
#691
TanUkkii007
closed
3 years ago
0
CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation
#690
TanUkkii007
opened
3 years ago
0
Efficient Transformers: A Survey
#689
TanUkkii007
opened
3 years ago
0
Accent Estimation of Japanese Words from Their Surfaces and Romanizations for Building Large Vocabulary Accent Dictionaries
#688
TanUkkii007
closed
3 years ago
0
Self-Supervised Representation Learning from Flow Equivariance
#687
TanUkkii007
opened
3 years ago
0
Learning Speech-driven 3D Conversational Gestures from Video
#686
TanUkkii007
opened
3 years ago
0
Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review
#685
TanUkkii007
opened
3 years ago
0
Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling
#684
TanUkkii007
closed
3 years ago
0
PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS
#683
TanUkkii007
closed
3 years ago
0
Variable-rate discrete representation learning
#682
TanUkkii007
closed
2 years ago
0
Perceiver: General Perception with Iterative Attention
#681
TanUkkii007
opened
3 years ago
0
Voice Conversion Challenge 2020: Intra-lingual semi-parallel and cross-lingual voice conversion
#680
TanUkkii007
opened
3 years ago
0
Semi-Supervised Spoken Language Understanding via Self-Supervised Speech and Language Model Pretraining
#679
TanUkkii007
opened
3 years ago
0
Transformers in Vision: A Survey
#678
TanUkkii007
closed
3 years ago
0
DeepBach: a Steerable Model for Bach Chorales Generation
#677
TanUkkii007
opened
3 years ago
0