Open kir1to455 opened 3 months ago
Interesting. Is this running in a conda environment or python environment? We occasionally see issues when running in conda.
Are you able to merge the remaining 20 files into the ip_merge.pod5 file?
Hi, @HalfPhoton
> We occasionally see issues when running in conda.
I am running this code in a conda environment.
> Are you able to merge the remaining 20 files into the ip_merge.pod5 file?
I don't know how pod5 merge handles the order of input files. Does it go through them in order, like test_0.pod5, test_1.pod5, ..., test_20.pod5? If so, I will try to merge the remaining ones.
Best wishes, Kirito
ah - I see.
In this case, please use pod5 view to list the read ids in the first merged output and in all inputs, then work out which ids are missing from the merged file:
# get read ids
pod5 view -IH input_data/ -o input.ids
pod5 view -IH merged.pod5 -o merged.ids
# Sort the files (comm requires sorted files)
sort input.ids > input.ids.sorted
sort merged.ids > merged.ids.sorted
# Find ids in input that are not in merged file
comm -23 input.ids.sorted merged.ids.sorted > missing.ids
# Get a pod5 file of only missing ids
pod5 filter input_data/ --ids missing.ids -o missing.pod5
# Merge in missing ids
pod5 merge merged.pod5 missing.pod5 -o merged.final.pod5
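To make the sort and comm step above concrete, here is a minimal sketch on toy data. The read ids (read-a to read-d) and the temporary directory are made up for illustration and pod5 is not involved at all; real ids come from pod5 view.

```shell
# Toy demonstration of the sort + comm -23 step; the ids below are made up.
workdir=$(mktemp -d)
printf '%s\n' read-c read-a read-b read-d > "$workdir/input.ids"
printf '%s\n' read-b read-a > "$workdir/merged.ids"

# comm compares line by line, so both files must be sorted with the same collation
LC_ALL=C sort "$workdir/input.ids" > "$workdir/input.ids.sorted"
LC_ALL=C sort "$workdir/merged.ids" > "$workdir/merged.ids.sorted"

# -2 drops lines found only in the second file, -3 drops lines common to both,
# which leaves the ids present in the input but absent from the merged file
missing=$(LC_ALL=C comm -23 "$workdir/input.ids.sorted" "$workdir/merged.ids.sorted")
echo "$missing"
```

Here only read-c and read-d are printed, since read-a and read-b already appear in the toy merged list. If missing.ids comes out empty on the real data, there is nothing left to filter or merge.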
I recommend using a python virtual environment instead of a conda environment:
python3.10 -m venv venv --prompt=pod5
source venv/bin/activate
pip install -U pip pod5
pod5 --version
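If it is unclear which environment is active after these steps, a quick check like the one below may help. It is only a sketch and assumes python3 is on PATH; it relies on the standard behaviour that sys.prefix differs from sys.base_prefix inside a venv, which is not the case for a plain conda environment.

```shell
# Report whether the active python3 is running inside a venv
in_venv=$(python3 -c 'import sys; print("yes" if sys.prefix != sys.base_prefix else "no")')
echo "inside a venv: $in_venv"

# Show which pod5 executable the shell would run, if any
command -v pod5 || echo "pod5 not found on PATH"
```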