Open kir1to455 opened 1 day ago
Hello
What is the specification of your system? This kernel error message is unlikely to have anything to do with f5c. I have seen this error a few times on single-board computers running unstable operating systems when the system load is high.
Is there a reason you are using loops in your script? Are these multiple samples that you are iterating over using i?
If it is a single sample, what about
Again, is there a reason why you are going over chromosomes individually, rather than doing it all at once?
If you want to keep the loop approach, I suggest at least the following.
Hi, @hasindu2008
My system is CentOS Linux 7. I only have two samples: input and ip.
- merging all the BLOW5 files into one BLOW5
- combining all FASTQ into one file
- merging and sorting all BAM files into one file

Then running a single f5c index followed by an f5c eventalign
In fact, I have always used the first method and never hit a bug. Excellent!
Again, is there a reason why you are going over chromosomes individually, rather than doing it all at once?
The reason I want to split by chromosome is that the eventalign file for RNA004 is too large (nearly 1 TB). Having one eventalign file per chromosome will help me with machine learning.
Approach 1 is to map the per-chromosome FASTQ files (like chr1.fq) to get chr1.bam, then run f5c index with the merged BLOW5 and run f5c eventalign.
Approach 2 is to map whole.fq to get whole.bam, run f5c eventalign, and then split the whole eventalign file into chrom.eventalign.tsv files.
It looked like the first method might take less time, so I tried the first one. I'll try the second approach.
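A minimal sketch of the splitting step in approach 2, assuming the eventalign output is a tab-separated file with one header line and the contig name in the first column. The function name `split_eventalign` and the output naming are hypothetical, chosen only for illustration. Note that with a transcriptome reference the first column holds transcript names rather than chromosomes, so an extra transcript-to-chromosome mapping step would be needed in that case.

```shell
# Split an eventalign TSV (optionally gzip-compressed) into one file per contig.
# Usage: split_eventalign <eventalign.tsv[.gz]> <out_dir>
split_eventalign() {
    in_file=$1
    out_dir=$2
    mkdir -p "$out_dir"
    # Decompress on the fly when the input ends in .gz
    case $in_file in
        *.gz) gzip -dc "$in_file" ;;
        *)    cat "$in_file" ;;
    esac | awk -F'\t' -v dir="$out_dir" '
        NR == 1 { header = $0; next }            # keep the header for every output file
        {
            name = $1
            gsub(/[^A-Za-z0-9._-]/, "_", name)   # make the contig name safe as a file name
            out = dir "/" name ".eventalign.tsv"
            if (!(out in seen)) {                # first row for this contig: start a fresh file
                print header > out
                close(out)
                seen[out] = 1
            }
            if (out != prev) {                   # keep only one output file open at a time
                if (prev != "") close(prev)
                prev = out
            }
            print >> out
        }'
}
```

For example, `split_eventalign whole.eventalign.tsv.gz by_chrom` would write `by_chrom/chr1.eventalign.tsv`, `by_chrom/chr2.eventalign.tsv`, and so on, each starting with the original header line.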
Best wishes, Kirito
Hi, @hasindu2008
When I use f5c eventalign on RNA004 data, the system shows this kernel error and crashes. Here are a few screenshots.
Here is my code:
for i in input; do
    echo $i
    mkdir -p $blow5_dir/${i}
    mkdir -p $Event_dir/${i}
    blue-crab p2s ${Pod5Dir}/${i}/pod5_pass -d $blow5_dir/${i} -t 30 -p 10 ### pod5 to slow5
    slow5tools merge $blow5_dir/${i} -o ${blow5_dir}/${i}_sup.pass.blow5 -t 30
    # collect the per-chromosome BAM files for this sample
    Bamfiles=($(find $BamDir/${i} -name "${i}_merge_sup_chr*.sorted.bam"))
    for bfile in ${Bamfiles[@]}; do
        echo $bfile
        bbase=$(basename $bfile .sorted.bam)
        ${f5c_dir}/f5c index -t 40 ${FastqDir}/${i}/${bbase}.fastq --slow5 ${blow5_dir}/${i}_sup.merge.blow5
        ${f5c_dir}/f5c eventalign --reads ${FastqDir}/${i}/${bbase}.fastq --bam $bfile --genome ${index_dir}/gencode.vM33.normal.transcripts.fa --slow5 ${blow5_dir}/${i}_sup.merge.blow5 -t 30 --kmer-model ${f5c_dir}/test/rna004-models/rna004.nucleotide.5mer.model --min-mapq 0 --secondary=no --rna --signal-index --scale-events --collapse-events --samples -B 14M -K 1024 --cuda-dev-id 0 --summary ${Event_dir}/${i}/${bbase}_nanopolish.summary.txt | pigz > ${Event_dir}/${i}/${bbase}.eventalign.tsv.gz ## --samples raw events --collapse-events
    done
done
Each time, I index the per-chromosome FASTQ file against the merged BLOW5 before running f5c eventalign. In my log file, I found that some reads were reported as not found in the file.
Is it not possible to use the split FASTQ files (chr1, chr2, chr3, ...) to index the merged BLOW5 file?
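One way to narrow down the "not found" messages is to list which read IDs in a split FASTQ are absent from the merged BLOW5. The sketch below assumes 4-line FASTQ records and a plain text file with one BLOW5 read ID per line; such a list can probably be produced with something like `slow5tools skim --rid`, but please check that against your slow5tools version. The function name `missing_reads` is hypothetical.

```shell
# Print read IDs that are present in a FASTQ file but absent from a read-ID list.
# Usage: missing_reads <reads.fastq> <blow5_read_ids.txt>
missing_reads() {
    fastq=$1
    id_list=$2
    awk -v ids="$id_list" '
        BEGIN {
            # load the known read IDs, one per line
            while ((getline line < ids) > 0) have[line] = 1
            close(ids)
        }
        NR % 4 == 1 {                 # FASTQ header lines (assumes 4 lines per record)
            id = $1
            sub(/^@/, "", id)         # drop the leading "@"
            if (!(id in have)) print id
        }' "$fastq"
}
```

If this prints nothing for a given chromosome FASTQ, every read in that file has a matching ID in the list, and the cause of the messages would have to be looked for elsewhere.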
Best wishes, Kirito