Thesis Repo
This repository tracks changes to my dissertation at ENES UNAM Morelia, Mexico.
Document Structure
-
Title Page
-
Table of Contents
-
Acknowledgements
-
Abstract
-
Preface
-
Chapter 0 (Introduction)
- The Cocktail Party Problem
- Historic Background
- Segmentation vs Attention Problem
- Inverse Problems
- Ill-Posed Problems
- Constraints
- Structure of Document
Part 1 (Literature Review)
-
Chapter 1 (Data Processing)
- Feature Extraction
- Spectrograms
- Mel-Bin
- Normalization
-
Chapter 2 (The Generation Problem)
- Time Series
- WaveNet
- Phase Loss
- Phase Storage
- Griffin-Lim
- Vocoders
- Masking Techniques
-
Chapter 3 (Looking to Listen)
-
Chapter 4 (Music-Speech)
Part 2 (Methodology)
-
Chapter 5 (Libraries)
- Librosa
- PyTorch
- TorchAudio
-
Chapter 6 (Implementation)
- Dataset
- Model Structure
- Preprocessing
- Targets (Tested)
Part 3
- Chapter 7 (Results)
- Results
- Quantified Losses
- Quality Tests
- Samples (images and audio)
- Transfer Learning
- Quantified Losses
- Quality Tests
- Samples (images and audio)
- Transfer Learning with Fine-Tuning
- Quantified Losses
- Quality Tests
- Samples (images and audio)
- Chapter 8 (Discussion)
Appendix
-
Time Series
- Signal Processing
- Audio/Speech Properties
-
Transforms
- Fourier Transform
- Spectra
- Discrete Fourier Transform
- Short-Time Fourier Transform
- Wavelet
- Scalogram
-
Spectrograms
- Linear and Log
- Amplitude vs Decibels
- Mel-Bins
- The Phase Problem
- Linear
- Decibel
- Mel-Bin
- Phase Retrieval Techniques
- Phase Storage
- Griffin-Lim Algorithm
- Vocoders
-
Machine Learning
- Supervised Learning vs Unsupervised Learning
- Advancements
- Problems
- Data
- Interpretability
- Overfitting
- Hyperparameters
- Computation
-
Deep Learning
- Neural Networks
- Structure
- Backpropagation Algorithm
- Forward Propagation
- Gradients
- Image Processing
- Convolutional Neural Networks
- Attention
- Segmentation
- Medicine
- U-Net
-
Source Code