ArnovanHilten / GenNet

Framework for Interpretable Neural Networks
Apache License 2.0
@ArnovanHilten error message for regression mode #91

Closed lesyngenta closed 1 year ago

lesyngenta commented 1 year ago

Hello Arno,

I prepared three datasets: genotype.h5, subjects.csv and topology.csv, and then ran this command:

python GenNet.py train -path (path for train) -ID 1 -epochs 50 -problem_type regression

I got the error message below:

no slurm id
number of covariates: 0
Covariate columns found: []
mode is regression
Traceback (most recent call last):
  File "GenNet.py", line 296, in <module>
    main()
  File "GenNet.py", line 22, in main
    train_regression(args)
  File "/scratch-large/4-quarterly/s1198162/GenNet/GenNet_utils/Train_network.py", line 273, in train_regression
    inputsize = get_inputsize(genotype_path)
  File "/scratch-large/4-quarterly/s1198162/GenNet/GenNet_utils/Dataloader.py", line 83, in get_inputsize
    inputsize = h5file.root.data.shape[1]
  File "/SD5/people/s1198162/miniforge3/envs/GenNet/lib/python3.7/site-packages/tables/group.py", line 798, in __getattr__
    return self._f_get_child(name)
  File "/SD5/people/s1198162/miniforge3/envs/GenNet/lib/python3.7/site-packages/tables/group.py", line 682, in _f_get_child
    self._g_check_has_child(childname)
  File "/SD5/people/s1198162/miniforge3/envs/GenNet/lib/python3.7/site-packages/tables/group.py", line 377, in _g_check_has_child
    % (self._v_pathname, name))
tables.exceptions.NoSuchNodeError: group / does not have a child named data
Closing remaining open files:/scratch-large/4-quarterly/s1198162/GenNet/train/genotype.h5...done

I can't tell whether the problem is related to the Linux system or to one of the files I used. How can I address it?

Thanks!

ArnovanHilten commented 1 year ago

Hi @lesyngenta

It seems that your genotype.h5 is corrupted. Can you do the following steps?

  1. Activate the GenNet environment (e.g. source ~/env_GenNet/bin/activate).
  2. Navigate to the folder that contains the file: cd /scratch-large/4-quarterly/s1198162/GenNet/train/ (check that the path is correct, also in your GenNet command).
  3. Type python to start Python on the command line.
  4. import tables
  5. f = tables.open_file("genotype.h5")
  6. If this gives an error, then the file is corrupted.
  7. If not, check f.root, f.root.data and f.root.data.shape.

Please let me know where the error pops up. If your file is corrupted, you need to convert the genotype file again. If you kept the intermediate files, you can continue from those (check that they are fine in the same way; the shape needs to correspond to your number of subjects and SNPs).
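
The interactive steps above can also be run as one small script. This is only a sketch, not part of GenNet: the function name describe_genotype_file and the status strings are made up for illustration. It assumes PyTables is installed in the active environment and that, as the traceback suggests, the genotype matrix is expected at the node /data.

```python
import os


def describe_genotype_file(path):
    """Report whether an HDF5 genotype file opens and has a /data node."""
    if not os.path.isfile(path):
        return {"status": "missing", "shape": None}

    # Imported here so the path check above works even without PyTables.
    import tables

    try:
        h5file = tables.open_file(path, mode="r")
    except (OSError, tables.HDF5ExtError) as err:
        # The file exists but cannot be read as HDF5: probably corrupted.
        return {"status": "unreadable", "shape": None, "error": str(err)}

    try:
        if "/data" not in h5file:
            # This is the situation behind NoSuchNodeError in the traceback.
            return {"status": "no_data_node", "shape": None}
        return {"status": "ok", "shape": tuple(h5file.root.data.shape)}
    finally:
        h5file.close()


if __name__ == "__main__":
    print(describe_genotype_file("genotype.h5"))
```

A status of "ok" only means the node exists; the reported shape still has to match the number of subjects and SNPs.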

lesyngenta commented 1 year ago

I think I found the reason. The error was due to genotype.h5. When I tried to convert the VCF file again, it showed:

There is no genotype data converted!
Time to convert all data: 3329.7319617271423 sec
/scratch-large/4-quarterly/s1198162/GenNet/corn/ <class 'str'>
Traceback (most recent call last):
  File "GenNet.py", line 296, in <module>
    main()
  File "GenNet.py", line 28, in main
    convert(args)
  File "/scratch-large/4-quarterly/s1198162/GenNet/GenNet_utils/Convert.py", line 454, in convert
    merge_hdf5_hase(args)
  File "/scratch-large/4-quarterly/s1198162/GenNet/GenNet_utils/Convert.py", line 63, in merge_hdf5_hase
    g = h5py.File(filepath_hase.format(0), 'r')['genotype']
  File "/SD5/people/s1198162/miniforge3/envs/GenNet/lib/python3.7/site-packages/h5py/_hl/files.py", line 408, in __init__
    swmr=swmr)
  File "/SD5/people/s1198162/miniforge3/envs/GenNet/lib/python3.7/site-packages/h5py/_hl/files.py", line 173, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (unable to open file: name = '/scratch-large/4-quarterly/s1198162/GenNet/corn//genotype/0_corn.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
Closing remaining open files:/scratch-large/4-quarterly/s1198162/GenNet/corn/probes/corn.h5...done

lesyngenta commented 1 year ago

Dear Arno,

Thanks so much for the prompt reply! Yes, I figured there might be something wrong with genotype.h5. When I converted the VCF file, it showed:

There is no genotype data converted!
Time to convert all data: 3329.7319617271423 sec
/scratch-large/4-quarterly/s1198162/GenNet/corn/ <class 'str'>
Traceback (most recent call last):
  File "GenNet.py", line 296, in <module>
    main()
  File "GenNet.py", line 28, in main
    convert(args)
  File "/scratch-large/4-quarterly/s1198162/GenNet/GenNet_utils/Convert.py", line 454, in convert
    merge_hdf5_hase(args)
  File "/scratch-large/4-quarterly/s1198162/GenNet/GenNet_utils/Convert.py", line 63, in merge_hdf5_hase
    g = h5py.File(filepath_hase.format(0), 'r')['genotype']
  File "/SD5/people/s1198162/miniforge3/envs/GenNet/lib/python3.7/site-packages/h5py/_hl/files.py", line 408, in __init__
    swmr=swmr)
  File "/SD5/people/s1198162/miniforge3/envs/GenNet/lib/python3.7/site-packages/h5py/_hl/files.py", line 173, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (unable to open file: name = '/scratch-large/4-quarterly/s1198162/GenNet/corn//genotype/0_corn.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
Closing remaining open files:/scratch-large/4-quarterly/s1198162/GenNet/corn/probes/corn.h5...done

What does the error mean?

Thanks so much! Le

From: Arno van Hilten
Sent: 26 September 2023, 15:37
To: ArnovanHilten/GenNet
Cc: LV Le CNBC; Mention
Subject: Re: [ArnovanHilten/GenNet] @ArnovanHilten error message for regression mode (Issue #91)

ArnovanHilten commented 1 year ago

Yes, that seems to be the issue. Are you using Windows? It is a bit unusual that "files:" is printed in (Closing remaining open files:/scratch-large/4-quarterly/s1198162/GenNet/corn/probes/corn.h5...done).

Did you provide the path like this: /scratch-large/4-quarterly/s1198162/GenNet/corn/probes/?

ArnovanHilten commented 1 year ago

The error seems simple: it cannot find the file /scratch-large/4-quarterly/s1198162/GenNet/corn//genotype/0_corn.h5. Can you check this folder?
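
Checking that folder can be done with a few lines of Python. The sketch below is illustrative only: inspect_conversion_output is a hypothetical helper, not a GenNet function, and the layout it assumes (a genotype subfolder holding files named 0_<study_name>.h5, 1_<study_name>.h5, ...) is inferred from the path in the error message. Note that on Linux a doubled slash in the middle of a path is treated like a single one, so corn//genotype by itself should not be what makes the file unfindable.

```python
import os


def inspect_conversion_output(out_dir, study_name):
    """List chunk files in <out_dir>/genotype and check for the first one."""
    genotype_dir = os.path.join(out_dir, "genotype")
    # File name pattern taken from the error message: 0_corn.h5 for study "corn".
    first_chunk = os.path.join(genotype_dir, "0_{}.h5".format(study_name))
    if os.path.isdir(genotype_dir):
        chunk_files = sorted(
            name for name in os.listdir(genotype_dir) if name.endswith(".h5")
        )
    else:
        chunk_files = []
    return {
        "genotype_dir_exists": os.path.isdir(genotype_dir),
        "first_chunk": first_chunk,
        "first_chunk_exists": os.path.isfile(first_chunk),
        "chunk_files": chunk_files,
    }


if __name__ == "__main__":
    report = inspect_conversion_output(
        "/scratch-large/4-quarterly/s1198162/GenNet/corn/", "corn"
    )
    for key, value in report.items():
        print(key, "=", value)
```

An existing but empty genotype folder would point to the conversion step producing no chunks, rather than to a wrong path in the merge step.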

lesyngenta commented 1 year ago

I used a Linux Slurm cluster, and the command was:

python GenNet.py convert -g /scratch-large/4-quarterly/s1198162/GenNet/vcf -o /scratch-large/4-quarterly/s1198162/GenNet/corn -study_name corn -vcf

lesyngenta commented 1 year ago

Where do I get 0_corn.h5? I thought it was created during conversion. Is that not the case? Should I prepare it myself? The corn/genotype folder is empty.

Thanks, Le

ArnovanHilten commented 1 year ago

python GenNet.py convert creates the genotype file, but I think you simply missed a '/' at the end of the folder paths.

Try this instead:

python GenNet.py convert -g /scratch-large/4-quarterly/s1198162/GenNet/vcf/ -o /scratch-large/4-quarterly/s1198162/GenNet/corn/ -study_name corn -vcf

Just make sure that /scratch-large/4-quarterly/s1198162/GenNet/corn/ exists and that there is a corn.vcf in /scratch-large/4-quarterly/s1198162/GenNet/vcf/

PS. Since your previous conversion failed, it is probably best to delete all files and folders created in the output folder during the failed run.
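
The checks in this comment (trailing slash, existing output folder, VCF file present, no leftovers from the failed run) can be sketched as a small pre-flight script to run before the convert command. The helper names with_trailing_sep and preflight_convert are hypothetical, and the assumption that the VCF file is named <study_name>.vcf is taken from the corn.vcf example above, not from the GenNet documentation.

```python
import os


def with_trailing_sep(path):
    """Return the path ending in a separator, adding one if it is missing."""
    return os.path.join(path, "")


def preflight_convert(vcf_dir, out_dir, study_name):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    # Assumed naming: <study_name>.vcf, as in the corn.vcf example.
    vcf_file = os.path.join(vcf_dir, study_name + ".vcf")
    if not os.path.isfile(vcf_file):
        problems.append("missing VCF file: " + vcf_file)
    if not os.path.isdir(out_dir):
        problems.append("output folder does not exist: " + out_dir)
    elif os.listdir(out_dir):
        problems.append(
            "output folder is not empty (leftovers from a failed run?): " + out_dir
        )
    return problems


if __name__ == "__main__":
    base = "/scratch-large/4-quarterly/s1198162/GenNet"
    vcf_dir = with_trailing_sep(os.path.join(base, "vcf"))
    out_dir = with_trailing_sep(os.path.join(base, "corn"))
    for problem in preflight_convert(vcf_dir, out_dir, "corn"):
        print("WARNING:", problem)
    print("-g", vcf_dir, "-o", out_dir)
```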

lesyngenta commented 1 year ago

Arno,

The conversion failed with the same error message. This time I added "/" at the end of the folder paths. Chunk files were actually generated; it seems that only the last step, creating genotype.h5, failed. The command was:

python GenNet.py convert -g /scratch-large/4-quarterly/s1198162/GenNet/vcf/ -o /scratch-large/4-quarterly/s1198162/GenNet/corn/ -study_name corn -vcf

Thanks, Le

lesyngenta commented 1 year ago

Dear Arno,

Eventually I created genotype.h5, but when I trained the model, a new error message was shown:

Traceback (most recent call last):
  File "GenNet.py", line 296, in <module>
    main()
  File "GenNet.py", line 22, in main
    train_regression(args)
  File "/scratch-large/4-quarterly/s1198162/GenNet/GenNet_utils/Train_network.py", line 313, in train_regression
    num_covariates=num_covariates)
  File "/scratch-large/4-quarterly/s1198162/GenNet/GenNet_utils/Create_network.py", line 288, in create_network_from_csv
    mask = scipy.sparse.coo_matrix(((matrix_ones), matrix_coord), shape = matrixshape)
  File "/SD5/people/s1198162/miniforge3/envs/GenNet/lib/python3.7/site-packages/scipy/sparse/coo.py", line 198, in __init__
    self._check()
  File "/SD5/people/s1198162/miniforge3/envs/GenNet/lib/python3.7/site-packages/scipy/sparse/coo.py", line 285, in _check
    raise ValueError('row index exceeds matrix dimensions')
ValueError: row index exceeds matrix dimensions

I have 540 genotypes and the ID doesn't exceed this. Do you have any idea what the reason is?

Thanks, Le
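
scipy raises this ValueError when the largest row index is greater than or equal to the number of rows of the matrix. Indices are zero-based, so an index equal to the dimension is already out of range. A rough way to look for such a mismatch is to compare the largest index in a topology.csv column with the dimension it is supposed to index. The sketch below rests on assumptions the thread does not confirm: the column name layer0_node is hypothetical, and which dimension the mask rows refer to (for example the input size read from genotype.h5) would have to be checked in Create_network.py.

```python
import csv


def max_index_in_column(csv_path, column):
    """Return the largest integer index found in one column of a CSV file."""
    with open(csv_path, newline="") as handle:
        values = [int(float(row[column])) for row in csv.DictReader(handle)]
    return max(values)


def index_fits(max_index, dimension):
    """Indices are zero-based, so the largest valid index is dimension - 1."""
    return max_index < dimension


if __name__ == "__main__":
    # "layer0_node" is an assumed column name; adjust it to your topology.csv.
    largest = max_index_in_column("topology.csv", "layer0_node")
    dimension = 540  # replace with the dimension the mask rows should index
    print("largest index:", largest, "fits:", index_fits(largest, dimension))
```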
