letizialamperti / DLeDNA

New variational autoencoders and deep metric learning based methods improve the dimensionality reduction of environmental DNA metabarcoding data

code #3

Open GGyynnn opened 9 months ago

GGyynnn commented 9 months ago

When I run ENNBetadist, I get the error "Error in py_call_impl(callable, call_args$unnamed, call_args$named) during training: Matrix type cannot be converted to python (only integer, numeric, complex, logical, and character matrixes can be converted." What might have gone wrong? For the Telonemia dataset, are the results of "dim(m_input_1), dim(m_input_2), dim(genetic_info_1), dim(genetic_info_2)" 320x237, 321x237, 320x32000, and 321x100? I ask because I suspect the error above is caused by a problem with my data. In addition, R2 is very unstable when I run the VAE. What might be the cause? Is a run of 10 epochs representative of your setup?

GGyynnn commented 7 months ago

The two-dimensional coordinates produced by VAE dimension reduction have no practical meaning in themselves; they only show how the data are distributed in the two-dimensional plane. Because of the randomness in the network, the coordinates differ from run to run (the overall data distribution is roughly the same each time, which supports the effectiveness of the network). However, in your plots of the VAESeq dimension reduction results for the Western Mediterranean eDNA fish dataset, the V1 coordinates always range from 0.15 to 0.3 and the V2 coordinates always range from 0.5 to 0.7. May I ask how this was achieved? In addition, your legend shows "Log MOTU richness". Did you log-transform the MOTU richness and then map it to colors, or did you log-transform the data as a whole (for example, the entire Western Mediterranean eDNA fish dataset)? I really hope to get your answer! This will help me a lot!

letizialamperti commented 7 months ago

In the paper we show that the two-dimensional distribution produced by a simple VAE applied to the data does not lead to a meaningful representation. By adding nucleotide sequences processed by an autoencoder, we obtain results consistent with biodiversity indicators such as alpha diversity of sequences, beta diversity of Jaccard and sequences, and the logarithm of MOTU richness. In the case of VAESeq, every image in the paper shows the same coordinates along the two axes because we used the same representation of the data, freezing the parameters and changing only the color value (e.g. Fig. S5 a and b).
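
To illustrate the idea of one frozen representation shared across panels, here is a minimal Python sketch. It is not the code used for the paper: `embed_once` and `make_panel` are hypothetical helpers, the seeded random generator only stands in for a trained encoder with frozen parameters, and all indicator values are made up.

```python
import random


def embed_once(n_samples, seed=0):
    """Stand-in for a frozen encoder (hypothetical helper).

    Returns fixed 2-D coordinates; in practice these would be the
    encoder outputs, computed a single time and then saved.
    """
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n_samples)]


def make_panel(coords, color_values, label):
    """Pair the shared coordinates with one indicator used only for coloring."""
    if len(coords) != len(color_values):
        raise ValueError("one color value per sample is required")
    points = [(v1, v2, c) for (v1, v2), c in zip(coords, color_values)]
    return {"label": label, "points": points}


coords = embed_once(5)               # computed once, reused by every panel
alpha = [0.2, 0.5, 0.1, 0.9, 0.4]    # made-up indicator values
richness = [12, 40, 7, 85, 30]       # made-up indicator values

panel_a = make_panel(coords, alpha, "alpha diversity")
panel_b = make_panel(coords, richness, "MOTU richness")

# Both panels share the same (V1, V2) positions; only the color value differs.
same_xy = [p[:2] for p in panel_a["points"]] == [p[:2] for p in panel_b["points"]]
print(same_xy)
```

Because the coordinates are computed once and only the color values change, the axis ranges are identical in every panel.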

"Log MOTU richness" means that we log-transformed the MOTU richness and then mapped it to the colors.
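
As a rough sketch of that kind of color mapping (not the paper's plotting code), the Python snippet below log-transforms richness values and rescales them to [0, 1], the range a colormap typically expects. The function name `log_color_scale`, the choice of the natural logarithm, and the sample counts are all assumptions made for illustration.

```python
import math


def log_color_scale(richness):
    """Log-transform richness values and rescale them to [0, 1].

    Assumes strictly positive counts and the natural logarithm;
    the base used in the paper is not stated in this thread.
    """
    if any(r <= 0 for r in richness):
        raise ValueError("richness must be positive to take a logarithm")
    logged = [math.log(r) for r in richness]
    lo, hi = min(logged), max(logged)
    if hi == lo:
        # All samples have the same richness: map everything to one color.
        return logged, [0.0 for _ in logged]
    scaled = [(x - lo) / (hi - lo) for x in logged]
    return logged, scaled


richness = [1, 10, 100]  # made-up MOTU counts
logged, scaled = log_color_scale(richness)
print(scaled)
```

Only the color values are transformed here; the input data and the coordinates are left unchanged.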

GGyynnn commented 4 months ago

Could you please provide the plotting code for Figure 3?