Here are the steps I took to reproduce the results described in the section "Motif dependent embeddings using simulation data".
I ran the scripts/multiple.py script with default parameters and observed the following test loss values:
ELBO: 23.95
Reconstruction error: 19.22
KL Divergence: 4.95
To reproduce the plot, I took the following steps (as there is no script available in the repository):
I took the file out/seqences.txt produced by scripts/multiple.py and sampled a FASTA file from it.
I used the sampled FASTA file as input for scripts/encode.py to create latent embeddings with the model created by scripts/multiple.py.
Using the out/embed.seq and out/sequences.txt files created by scripts/encode.py, I plotted the latent embeddings coloured by their corresponding motif.
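Since the repository has no script for the sampling and plotting steps, here is a minimal sketch of how they could be done. It is not the code from the paper. It assumes the sequence file holds one sequence per line and that the 2D embeddings and motif labels have already been parsed into Python lists. The function names, the `seq_<index>` record ids and the output paths are hypothetical.

```python
import random
from collections import defaultdict


def sample_fasta(seq_path, fasta_path, n, seed=0):
    """Write up to n randomly chosen sequences from seq_path as FASTA records."""
    with open(seq_path) as fh:
        # Assumption: one sequence per line; blank lines are skipped.
        seqs = [line.strip() for line in fh if line.strip()]
    rng = random.Random(seed)
    # Sample indices without replacement so each record id points back
    # to the position of the sequence in the source file.
    picked = sorted(rng.sample(range(len(seqs)), min(n, len(seqs))))
    with open(fasta_path, "w") as out:
        for i in picked:
            out.write(f">seq_{i}\n{seqs[i]}\n")  # hypothetical id scheme
    return picked


def group_by_motif(embeddings, motifs):
    """Group (x, y) embedding points by motif label, one group per colour."""
    groups = defaultdict(list)
    for point, motif in zip(embeddings, motifs):
        groups[motif].append(point)
    return dict(groups)


def plot_groups(groups, png_path):
    """Scatter-plot each motif group in its own colour (needs matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for motif, points in sorted(groups.items()):
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]
        ax.scatter(xs, ys, s=5, label=str(motif))
    ax.legend(title="motif")
    fig.savefig(png_path)
```

For example, `sample_fasta("out/seqences.txt", "sampled.fasta", n=1000)` would produce the input for scripts/encode.py. Once the embeddings and motif labels are loaded, `plot_groups(group_by_motif(embeddings, motifs), "embeddings.png")` draws the scatter plot.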
However, if I switch off the force_matching option (see https://github.com/Xilorole/raptgen/blob/c4986ca9fa439b9389916c05829da4ff9c30d6f3/scripts/multiple.py#L84) and run scripts/multiple.py with otherwise default parameters, the test loss values I observe are quite close to those reported in the paper (20.60, 16.02, 4.59 respectively).
The resulting plot also looks very similar to Fig. 2b (HMM profile).
I repeated both experiments (force_matching enabled and force_matching disabled) with different seeds.
Could you clarify which parameters you used when training the CNN_PHMM_VAE model?