EpiSlim closed this issue 2 years ago
Hi Lotfi,
`inputsize` should indeed be the number of SNPs. GenNet checks the folder for .h5 files and takes the second dimension as `inputsize`. During conversion GenNet transposes the `plink` files. My suspicion is that GenNet opened an intermediate file that was not transposed yet. Do you have multiple .h5 files in the input folder?
You can delete the other .h5 files that are created except for the `genotype.h5`.
Best,
Arno
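To make the reasoning in this reply concrete, here is a minimal sketch of the shape check it implies. It is not GenNet code: `input_size` and `diagnose_orientation` are hypothetical helper names, and the sketch assumes that a fully converted file is laid out as (individuals, SNPs) while an intermediate file that has not been transposed yet is laid out as (SNPs, individuals). The shape itself would have to be read from the .h5 file with an HDF5 library such as h5py or PyTables.

```python
def input_size(shape):
    """Mimic the rule described above: the second dimension is the input size."""
    return shape[1]


def diagnose_orientation(shape, n_individuals, n_snps):
    """Classify a genotype matrix shape against the expected study dimensions.

    Assumption (not taken from the GenNet source): the converted file is
    (individuals, SNPs) and an untransposed intermediate is (SNPs, individuals).
    """
    if len(shape) != 2:
        return "unexpected"
    if tuple(shape) == (n_individuals, n_snps):
        return "ok"
    if tuple(shape) == (n_snps, n_individuals):
        return "not transposed"
    return "unexpected"


# Dimensions reported in this thread: 500 individuals, 60708 SNPs.
print(input_size((500, 60708)))                        # 60708
print(diagnose_orientation((500, 60708), 500, 60708))  # ok
print(input_size((60708, 500)))                        # 500
print(diagnose_orientation((60708, 500), 500, 60708))  # not transposed
```

An input size of 500 for this dataset is what one would expect if a file in the second orientation were picked up instead of the final `genotype.h5`.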
Hey Arno! Out of the same `plink` file, three intermediate `.h5` files were created in the `genotype` subfolder: `0_plink.h5`, `1_plink.h5` and `2_plink.h5`.
What changes need to be made so that the final `genotype.h5` has the correct dimensions?
Cheers, Lotfi
Hi @EpiSlim,
There should be a file called `genotype.h5` in `GenNet_data/input`; you can delete the other .h5 files.
If it is not there, it could be in the processed_data folder, but looking at your command it should be in `GenNet_data/input`.
Best,
Arno
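Before deleting anything, it can help to list which .h5 files other than `genotype.h5` are actually present. The following is a small dry-run sketch using only the Python standard library. `stray_h5_files` is a hypothetical helper, not part of GenNet, and the folder path is the one from the command quoted in this thread. It only reports files and removes nothing.

```python
from pathlib import Path


def stray_h5_files(folder, keep="genotype.h5", recursive=False):
    """Return the .h5 files under `folder` whose name differs from `keep`.

    With recursive=True, subfolders (for example a `genotype` subfolder
    holding intermediate files) are searched as well.
    """
    root = Path(folder)
    matches = root.rglob("*.h5") if recursive else root.glob("*.h5")
    return sorted(p for p in matches if p.name != keep)


if __name__ == "__main__":
    # Dry run: print candidates only, do not delete.
    for path in stray_h5_files("GenNet_data/input", recursive=True):
        print("lingering file:", path)
```

Reviewing the printed list by hand before removing files keeps the final `genotype.h5` safe.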
OK, do you mean that GenNet is currently using one of the other intermediate files in the training stage?
I am not 100% certain, I would need to see more for that. My suspicion is that there are lingering files. In your input folder / datapath should only be:
Issue fixed. Thanks!
Hey Arno!
I am trying to use `GenNet` on a dataset that I converted from the `plink` format as follows: `python GenNet/GenNet.py convert -step all -g GenNet_data/plink -study_name plink -o GenNet_data/input`. The conversion stage processes without errors and the log includes the right dimensionality of my toy dataset (500 individuals and 60708 SNPs). However, when I proceed to the training stage, the following assertion is failing: https://github.com/ArnovanHilten/GenNet/blob/fbed86365f37549505bbda13227e1a34a301327f/GenNet_utils/Create_network.py#L234
The reason why the above assertion is failing is that `inputsize` evaluates to `500` (number of individuals) while `mask_shapes_x[0]` evaluates to `60708` (number of SNPs).
Is my understanding correct that `inputsize` should also be the number of SNPs? If so, where is the issue given the fact that the conversion log correctly displays the number of individuals and number of variants?
Many thanks, Lotfi