Estimating phylogenetic trees from the 16S rRNA gene of six microorganisms with unsupervised learning, web-based tools and Molecular Evolutionary Genetics Analysis (MEGA7)
Consider using DictVectorizer and semi-supervised learning to see if any generalizations arise from using a neural network. Review contrastive loss and ideas here #44
DictVectorizer will not work well here because the sequences have variable lengths. Embedding-based approaches handle this with padding, which brings all sequences to the same length.
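As a rough illustration of the padding step, here is a minimal sketch in plain Python. The vocabulary, the `PAD` and `UNK` codes, the helper names `encode` and `pad_sequences`, and the toy sequences are all assumptions made for this example; they are not taken from the repository, and a deep learning framework's own padding utilities would normally be used instead.

```python
# Hypothetical nucleotide vocabulary; 0 is reserved for padding.
VOCAB = {"A": 1, "C": 2, "G": 3, "T": 4}
PAD = 0
UNK = 5  # any other symbol (e.g. N or other ambiguity codes)


def encode(seq):
    """Map each base of a sequence to an integer index."""
    return [VOCAB.get(base, UNK) for base in seq.upper()]


def pad_sequences(encoded, pad_value=PAD):
    """Right-pad every encoded sequence to the length of the longest one."""
    max_len = max((len(s) for s in encoded), default=0)
    return [s + [pad_value] * (max_len - len(s)) for s in encoded]


# Toy sequences of different lengths (not real 16S rRNA data).
toy = ["ACGT", "AC", "GATTACA"]
padded = pad_sequences([encode(s) for s in toy])

for row in padded:
    print(row)
```

After padding, every row has the same length, so the rows can be stacked into a single array and passed to an embedding layer that is told to ignore index 0.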
This workflow allows us to build a dictionary-structured representation of the data, that is, an embedding, which we can use with semi-supervised or unsupervised methods. With loss functions such as contrastive loss, we can then examine similarities and differences between sequences.
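To make the contrastive loss idea concrete, here is a small sketch of one common pairwise form: pairs labelled as similar are penalized by their squared distance, and dissimilar pairs are penalized only when they are closer than a margin. The function names, the `margin` default, and the toy vectors are assumptions for illustration; some formulations also include a factor of 1/2, and this is not necessarily the variant discussed in #44.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(u, v, same, margin=1.0):
    """Pairwise contrastive loss.

    same=True : pull the pair together (loss = d^2).
    same=False: push the pair apart until it is at least `margin` away
                (loss = max(0, margin - d)^2).
    """
    d = euclidean(u, v)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Toy 2-D embeddings (hypothetical values, not learned from data).
emb_a = [0.0, 0.0]
emb_b = [0.6, 0.0]

print(contrastive_loss(emb_a, emb_b, same=True))
print(contrastive_loss(emb_a, emb_b, same=False))
```

In a training loop this value would be averaged over many pairs of sequence embeddings, so that related sequences end up close together and unrelated ones at least a margin apart.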