Closed: lesyngenta closed this issue 1 year ago
Command line: python GenNet.py train -path /scratch-large/4-quarterly/s1198162/GenNet/train/ -ID 1 -epochs 50 -problem_type regression
Hi Le,
Can you show or upload the topology.csv? It should not exceed 540. Remember that zero is included in the count so the max should be 539.
Best,
Arno
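The range rule above (indices run from 0 to 539 when there are 540 inputs) can be checked before training. The following is a minimal sketch using only the Python standard library. The column name `layer0_node` and the sample rows are assumptions for illustration and may not match your topology.csv.

```python
import csv
import io


def find_out_of_range(csv_text, column, n_inputs):
    """Return (line_number, value) pairs whose index is outside 0..n_inputs-1."""
    bad = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Line 1 of the file is the header, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        value = int(row[column])
        if value < 0 or value >= n_inputs:
            bad.append((line_no, value))
    return bad


# Hypothetical topology with three connections; column names are assumed.
sample = "layer0_node,layer1_node\n0,0\n539,1\n540,1\n"
problems = find_out_of_range(sample, "layer0_node", n_inputs=540)
print(problems)  # [(4, 540)]
```

For a real file, read its text with `open(path).read()` and pass the column that holds the input indices. Any reported value equal to the number of inputs usually points to one-based numbering.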
Hi Arno,
I modified the topology.csv file, and now the model structure can be built. However, when training started from scratch, the job crashed: /var/spool/slurmd/job165654/slurm_script: line 4: 34171 Segmentation fault (core dumped) python GenNet.py train -path /scratch-large/4-quarterly/s1198162/GenNet/train/ -ID 1 -epochs 1000 -problem_type regression
The GPU on our server has 32 GB of memory, and I have already reduced topology.csv as much as possible, to a matrix shape of (460, 137). Why does it still fail to run?
Thanks, Le
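As a rough sanity check on the memory question, the arithmetic below estimates the size of a (460, 137) connection matrix even if it were stored densely. It is only a sketch: it assumes 4-byte float32 values, and it ignores the genotype data, activations, and framework overhead.

```python
rows, cols = 460, 137
bytes_per_float32 = 4  # assumption: values stored as 32-bit floats

# Size of the matrix if every entry were stored, not just the connections.
dense_bytes = rows * cols * bytes_per_float32
dense_mib = dense_bytes / 2**20

print(dense_bytes)          # 252080
print(round(dense_mib, 2))  # 0.24
```

At about a quarter of a mebibyte, a matrix of this shape is negligible next to 32 GB, so the matrix size alone is unlikely to explain the segmentation fault.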
Then I switched to running on the CPU, and the error message was more complicated:
Start training from scratch
WARNING:tensorflow:From /scratch-large/4-quarterly/s1198162/GenNet/GenNet_utils/Train_network.py:381: Model.fit_generator (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version.
Instructions for updating:
Please use Model.fit, which supports generators.
Epoch 1/50
Traceback (most recent call last):
  File "GenNet.py", line 296, in <module>
Errors may have originated from an input operation.
Input Source operations connected to node model/LocallyDirected_0/SparseTensorDenseMatMul/SparseTensorDenseMatMul:
 model/LocallyDirected_0/Reshape (defined at /scratch-large/4-quarterly/s1198162/GenNet/GenNet_utils/LocallyDirected1D.py:223)
 model/LocallyDirected_0/strided_slice_1 (defined at /scratch-large/4-quarterly/s1198162/GenNet/GenNet_utils/LocallyDirected1D.py:181)
Function call stack: train_function
Closing remaining open files:/scratch-large/4-quarterly/s1198162/GenNet/train//genotype.h5...done
Segmentation fault (core dumped)
Dear Arno,
Sorry for all the trouble. As you suggested, the problems were caused by topology.csv. I corrected the error and regenerated the file, and now everything works. Thank you so much! I will close the issue.
Best, Le
Dear Arno,
Eventually I created genotype.h5, but when I trained the model, a new error message appeared:
Traceback (most recent call last):
  File "GenNet.py", line 296, in <module>
    main()
  File "GenNet.py", line 22, in main
    train_regression(args)
  File "/scratch-large/4-quarterly/s1198162/GenNet/GenNet_utils/Train_network.py", line 313, in train_regression
    num_covariates=num_covariates)
  File "/scratch-large/4-quarterly/s1198162/GenNet/GenNet_utils/Create_network.py", line 288, in create_network_from_csv
    mask = scipy.sparse.coo_matrix(((matrix_ones), matrix_coord), shape = matrixshape)
  File "/SD5/people/s1198162/miniforge3/envs/GenNet/lib/python3.7/site-packages/scipy/sparse/coo.py", line 198, in __init__
    self._check()
  File "/SD5/people/s1198162/miniforge3/envs/GenNet/lib/python3.7/site-packages/scipy/sparse/coo.py", line 285, in _check
    raise ValueError('row index exceeds matrix dimensions')
ValueError: row index exceeds matrix dimensions
I have 540 genotypes, and no ID exceeds this number. Do you have any idea what the cause is?
Thanks, Le
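The ValueError in the traceback comes from a bounds check: a sparse matrix declared with a given number of rows cannot hold a coordinate whose row index is equal to or larger than that number. The sketch below mirrors that comparison in plain Python rather than calling scipy. The helper names `required_shape` and `fits`, and the coordinates, are hypothetical.

```python
def required_shape(coords):
    """Smallest (rows, cols) shape that can hold every zero-based coordinate."""
    max_row = max(r for r, _ in coords)
    max_col = max(c for _, c in coords)
    return (max_row + 1, max_col + 1)


def fits(coords, shape):
    """True if all coordinates fall inside a matrix of the declared shape."""
    need_rows, need_cols = required_shape(coords)
    return need_rows <= shape[0] and need_cols <= shape[1]


# Hypothetical connections: the last one uses row index 540.
coords = [(0, 0), (539, 1), (540, 1)]
print(required_shape(coords))   # (541, 2)
print(fits(coords, (540, 2)))   # False
```

With 540 genotypes the declared shape has 540 rows, so a single row index of 540 already requires 541 rows and fails the check.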