TanUkkii007 papers-i-read issues

23 stars, 3 forks

issues

A Cookbook of Self-Supervised Learning

#776 TanUkkii007 opened 1 year ago
0
A Survey of Large Language Models

#775 TanUkkii007 opened 1 year ago
0
Adversarial Feature Learning and Unsupervised Clustering based Speech Synthesis for Found Data with Acoustic and Textual Noise

#774 TanUkkii007 opened 1 year ago
0
Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision

#773 TanUkkii007 opened 1 year ago
0
SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization

#772 TanUkkii007 opened 2 years ago
0
Self-supervised Context-aware Style Representation for Expressive Speech Synthesis

#771 TanUkkii007 opened 2 years ago
0
Beyond Separability: Analyzing the Linear Transferability of Contrastive Representations to Related Subpopulations

#770 TanUkkii007 opened 2 years ago
0
Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech

#769 TanUkkii007 opened 2 years ago
0
Elucidating the Design Space of Diffusion-Based Generative Models

#768 TanUkkii007 opened 2 years ago
0
Classifier-Free Diffusion Guidance

#767 TanUkkii007 opened 2 years ago
0
Diffusion Autoencoders: Toward a Meaningful and Decodable Representation

#766 TanUkkii007 opened 2 years ago
0
Zero-Shot Voice Conditioning for Denoising Diffusion TTS Models

#765 TanUkkii007 opened 2 years ago
0
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models

#764 TanUkkii007 closed 2 years ago
0
Improved Denoising Diffusion Probabilistic Models

#763 TanUkkii007 closed 2 years ago
0
Image Super-Resolution via Iterative Refinement

#762 TanUkkii007 closed 2 years ago
0
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality

#761 TanUkkii007 opened 2 years ago
0
VoiceGrad: Non-Parallel Any-to-Many Voice Conversion with Annealed Langevin Dynamics

#760 TanUkkii007 opened 2 years ago
0
SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping

#759 TanUkkii007 opened 2 years ago
0
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis

#758 TanUkkii007 opened 2 years ago
0
vTTS: visual-text to speech

#757 TanUkkii007 closed 2 years ago
0
Creative Language Retrieval: A Robust Hybrid of Information Retrieval and Linguistic Creativity

#756 TanUkkii007 opened 2 years ago
0
A Dual Reinforcement Learning Framework for Unsupervised Text Style Transfer

#755 TanUkkii007 opened 2 years ago
0
Encode, Tag, Realize: High-Precision Text Editing

#754 TanUkkii007 closed 2 years ago
0
DopeLearning: A Computational Approach to Rap Lyrics Generation

#753 TanUkkii007 closed 2 years ago
0
Conditional LSTM-GAN for Melody Generation from Lyrics

#752 TanUkkii007 opened 2 years ago
0
Generation of Hip-Hop Lyrics with Hierarchical Modeling and Conditional Templates

#751 TanUkkii007 opened 2 years ago
0
Rapformer: Conditional Rap Lyrics Generation with Denoising Autoencoders

#750 TanUkkii007 closed 2 years ago
0
A Verse Generation System for Rap Battles That Considers the Semantic Similarity of Rhymes (ラップバトルにおけるライムの意味類似性を考慮したバース生成システム)

#749 TanUkkii007 closed 2 years ago
0
AI Song Contest: Human-AI Co-Creation in Songwriting

#748 TanUkkii007 opened 2 years ago
0
Progressive Distillation for Fast Sampling of Diffusion Models

#747 TanUkkii007 opened 2 years ago
0
The Perils of Using Mechanical Turk to Evaluate Open-Ended Text Generation

#746 TanUkkii007 opened 2 years ago
0
DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs

#745 TanUkkii007 opened 2 years ago
0
MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical Modeling

#744 TanUkkii007 opened 2 years ago
0
Data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language

#743 TanUkkii007 closed 2 years ago
0
DDS: A new device-degraded speech dataset for speech enhancement

#742 TanUkkii007 closed 2 years ago
0
Data Augmentation Approaches in Natural Language Processing: A Survey

#741 TanUkkii007 opened 2 years ago
0
End-to-End Text-to-Speech Synthesis with Unaligned Multiple Language Units Based on Attention

#740 TanUkkii007 closed 2 years ago
0
Towards Learning Universal Audio Representations

#739 TanUkkii007 opened 3 years ago
0
Speech Resynthesis from Discrete Disentangled Self-Supervised Representations

#738 TanUkkii007 opened 3 years ago
0
T5G2P: Using Text-to-Text Transfer Transformer for Grapheme-to-Phoneme Conversion

#737 TanUkkii007 closed 3 years ago
0
Phrase Break Prediction with Bidirectional Encoder Representations in Japanese Text-to-Speech Synthesis

#736 TanUkkii007 closed 3 years ago
0
Pre-Training for Spoken Language Understanding with Joint Textual and Phonetic Representation Learning

#735 TanUkkii007 closed 3 years ago
0
Improving Prosody with Linguistic and Bert Derived Features in Multi-Speaker Based Mandarin Chinese Neural TTS

#734 TanUkkii007 opened 3 years ago
0
Pre-Trained Text Representations for Improving Front-End Text Processing in Mandarin Text-to-Speech Synthesis

#733 TanUkkii007 opened 3 years ago
0
Improving the Prosody of RNN-Based English Text-To-Speech Synthesis by Incorporating a BERT Model

#732 TanUkkii007 closed 3 years ago
0
Audiobook Speech Synthesis Conditioned by Cross-Sentence Context-Aware Word Embeddings

#731 TanUkkii007 closed 3 years ago
0
Exploiting Syntactic Features in a Parsed Tree to Improve End-to-End TTS

#730 TanUkkii007 opened 3 years ago
0
Prosodic Representation Learning and Contextual Sampling for Neural Text-to-Speech

#729 TanUkkii007 closed 3 years ago
0
Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Text Data

#728 TanUkkii007 opened 3 years ago
0
Improving Prosody Modelling with Cross-Utterance Bert Embeddings for End-to-End Speech Synthesis

#727 TanUkkii007 closed 3 years ago
0