nanoporetech / remora

Methylation/modified base calling separated from basecalling.
https://nanoporetech.com

"ValueError: need at least one array to concatenate" when using remora model train on test data #197

Open spoweekkk opened 1 day ago

spoweekkk commented 1 day ago

While following the pipeline with the test data, I ran into a "Not enough chunks" error, so I followed the advice you gave previously and ran the command:

"remora model train train_dataset.jsn --model ~/gpfs1/Software/remora/models/ConvLSTM_w_ref.py --chunk-context 50 50 --output-path train_results --overwrite --num-test-chunks 200"

I got the following output and error:

[17:28:09.585] Seed selected is 711195172
[17:28:09.587] Loading dataset from Remora dataset config
[17:28:09.604] Dataset summary:
                      size : 415
        kmer context bases : (4, 4)
             chunk context : (50, 50)
            reverse signal : False
  chunk extract base start : False
      chunk extract offset : 0
                pa scaling : None
           sig map refiner : Loaded 9-mer table with 7 central position. Rough re-scaling will be executed.
         batches preloaded : False
       is modbase dataset? : True
                 mod bases : ['m']
            mod long names : ['5mC']
                    motifs : [('CG', 0)]
[17:28:09.605] Loading model
[17:28:09.613] Model structure:
network(
  (sig_conv1): Conv1d(1, 4, kernel_size=(5,), stride=(1,))
  (sig_bn1): BatchNorm1d(4, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (sig_conv2): Conv1d(4, 16, kernel_size=(5,), stride=(1,))
  (sig_bn2): BatchNorm1d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (sig_conv3): Conv1d(16, 64, kernel_size=(9,), stride=(3,))
  (sig_bn3): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (seq_conv1): Conv1d(36, 16, kernel_size=(5,), stride=(1,))
  (seq_bn1): BatchNorm1d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (seq_conv2): Conv1d(16, 64, kernel_size=(13,), stride=(3,))
  (seq_bn2): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (merge_conv1): Conv1d(128, 64, kernel_size=(5,), stride=(1,))
  (merge_bn): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (lstm1): LSTM(64, 64)
  (lstm2): LSTM(64, 64)
  (fc): Linear(in_features=64, out_features=2, bias=True)
  (dropout): Dropout(p=0.3, inplace=False)
)
[17:28:09.617] Gradients will be clipped (by value) at 0.00 MADs above the median of the last 1000 gradient maximums.
[17:28:09.765] Params (k) 134.08 | MACs (M) 7327.45
[17:28:09.765] Preparing training settings
[17:28:09.766] Training optimizer and scheduler settings: TrainOpts(epochs=100, early_stopping=10, optimizer_str='AdamW', opt_kwargs=(('weight_decay', 0.0001, 'float'),), learning_rate=0.001, lr_scheduler_str='CosineAnnealingLR', lr_scheduler_kwargs=(('T_max', 100, 'int'), ('eta_min', 1e-06, 'float')), lr_cool_down_epochs=5, lr_cool_down_lr=1e-07)
[17:28:10.865] Dataset loaded with labels: control:205; 5mC:210
[17:28:10.865] Train labels: control:105; 5mC:110
[17:28:10.865] Held-out validation labels: control:0; 5mC:0
[17:28:10.865] Training set validation labels: control:0; 5mC:0
[17:28:10.865] Running initial validation
Batches: 0it [00:00, ?it/s]
Traceback (most recent call last):
  File "/lustre2/jdhan_pkuhpc/common/mamba/envs/remora/bin/remora", line 8, in <module>
    sys.exit(run())
  File "/lustre2/jdhan_pkuhpc/common/mamba/envs/remora/lib/python3.8/site-packages/remora/main.py", line 71, in run
    cmd_func(args)
  File "/lustre2/jdhan_pkuhpc/common/mamba/envs/remora/lib/python3.8/site-packages/remora/parsers.py", line 1377, in run_model_train
    train_model(
  File "/lustre2/jdhan_pkuhpc/common/mamba/envs/remora/lib/python3.8/site-packages/remora/train_model.py", line 379, in train_model
    val_metrics = val_fp.validate_model(
  File "/lustre2/jdhan_pkuhpc/common/mamba/envs/remora/lib/python3.8/site-packages/remora/validate.py", line 282, in validate_model
    ms = self.run_validation(
  File "/lustre2/jdhan_pkuhpc/common/mamba/envs/remora/lib/python3.8/site-packages/remora/validate.py", line 247, in run_validation
    all_outputs = np.concatenate(all_outputs, axis=0)
  File "<__array_function__ internals>", line 200, in concatenate
ValueError: need at least one array to concatenate
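
For context, the final exception is just NumPy refusing to concatenate an empty list: the validation loop produced zero batches (note the "Batches: 0it" line and the 0/0 held-out validation labels above), so the `all_outputs` list in remora/validate.py was still empty when `np.concatenate` ran. A minimal sketch of that NumPy behavior, independent of Remora:

```python
import numpy as np

# If the validation loader yields no batches, the list of per-batch model
# outputs stays empty, and concatenating it reproduces the error above.
all_outputs = []  # zero validation batches collected
try:
    np.concatenate(all_outputs, axis=0)
except ValueError as err:
    print(err)  # -> need at least one array to concatenate
```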

Could you give me some advice on this error?

marcus1487 commented 17 hours ago

It appears that the number of test chunks is indeed 0, even though it looks like there are enough chunks in the dataset. I'm not sure why that would be. Could you try extracting a smaller number of test chunks, say 50, to see if that resolves the issue?
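
For reference, that retry is just the original command with `--num-test-chunks` lowered from 200 to 50; the paths below assume the same files as in the report above:

```bash
remora model train train_dataset.jsn \
    --model ~/gpfs1/Software/remora/models/ConvLSTM_w_ref.py \
    --chunk-context 50 50 \
    --output-path train_results \
    --overwrite \
    --num-test-chunks 50
```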