I am trying to fine-tune for Arabidopsis thaliana with additional RNA-seq data.
Below are the commands I used, but I got the following error when I ran filter-to-most-certain.py:
selecting 265 with average normalized distances below in each genic proportion ranking [0.032440142162364384, 0.034895248784137675, 0.03238402543958099, 0.03226711560044893, 0.015221076505798728]
INFO: the following arrays will be copied in their entirety and not be subset,
these are expected to relate to metadata:
['evaluation/rnaseq_meta/bam_files']
Traceback (most recent call last):
  File "/tmp/global2/wxian/software/Helixer_fine_tuning/filter-to-most-certain.py", line 116, in <module>
    main(args)
  File "/tmp/global2/wxian/software/Helixer_fine_tuning/filter-to-most-certain.py", line 101, in main
    copy_groups_recursively(h5_in, h5_out, skip_arrays=skip_groups, start_i=si, end_i=si + max_n_chunks,
  File "/tmp/global2/wxian/software/Helixer_fine_tuning/n90_train_val_split.py", line 121, in copy_groups_recursively
    h5_in.visititems(maybe_copy_some_data)
  File "/tmp/global2/wxian/conda/envs/htseq/lib/python3.10/site-packages/h5py/_hl/group.py", line 668, in visititems
    return h5o.visit(self.id, proxy)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5o.pyx", line 355, in h5py.h5o.visit
  File "h5py/h5o.pyx", line 302, in h5py.h5o.cb_obj_simple
  File "/tmp/global2/wxian/conda/envs/htseq/lib/python3.10/site-packages/h5py/_hl/group.py", line 667, in proxy
    return func(name, self[name])
  File "/tmp/global2/wxian/software/Helixer_fine_tuning/n90_train_val_split.py", line 119, in maybe_copy_some_data
    copy_some_data(h5_in, h5_out, name, mask, start_i, end_i)
  File "/tmp/global2/wxian/software/Helixer_fine_tuning/n90_train_val_split.py", line 105, in copy_some_data
    keep_idxs = keep_idxs[mask]
IndexError: boolean index did not match indexed array along dimension 0; dimension is 1 but corresponding boolean dimension is 30
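For reference, the final IndexError can be reproduced in isolation with NumPy. This is only a minimal sketch, assuming the mask has one entry per chunk (30 here) while the array being indexed has a single row. The helper `first_dim_mismatches` and the dataset names are hypothetical illustrations and are not taken from the scripts above.

```python
import numpy as np


def first_dim_mismatches(shapes, n_chunks):
    """Return the names of arrays whose first dimension differs from n_chunks.

    `shapes` maps a dataset name to its shape tuple (hypothetical input,
    e.g. collected by walking an h5 file).
    """
    return sorted(
        name for name, shape in shapes.items()
        if len(shape) == 0 or shape[0] != n_chunks
    )


# A boolean mask with one entry per chunk (30, as in the error message).
mask = np.zeros(30, dtype=bool)
mask[:5] = True

# An array with a single row, like the one that triggered the error.
keep_idxs = np.arange(1)

try:
    keep_idxs[mask]
    error_message = None
except IndexError as err:
    # NumPy refuses a boolean index whose length differs from the axis length.
    error_message = str(err)

print(error_message)

# Hypothetical shapes: one per-chunk array and one metadata-like array.
shapes = {"example/per_chunk": (30, 100), "example/metadata": (1,)}
mismatched = first_dim_mismatches(shapes, n_chunks=len(mask))
print(mismatched)
```

If this is indeed what happens, any array listed by such a check would need to be copied whole rather than subset with the per-chunk mask.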
Hey, many thanks for this awesome tool!
https://raw.githubusercontent.com/weberlab-hhu/helixer_scratch/master/data_scripts/filter-to-most-certain.py
https://raw.githubusercontent.com/weberlab-hhu/helixer_scratch/master/data_scripts/n90_train_val_split.py
commands:
Error message of filter-to-most-certain.py