JiekaiLab / scTE

Stuck while quantifying! #26

Open me37uday opened 2 years ago

me37uday commented 2 years ago

Hi,

Thanks for scTE.

I am trying to run scTE on 4 samples using the following command :

#!/bin/bash

scTE -i *.bam -o out -x /home/urangasw/Softwares/scTE/out.inclusive.idx --min_genes 100 --min_counts 400 -p 16

I used the following parameters to submit the job:

sbatch --ntasks=1 --cpus-per-task=32 --mem=90000mb --partition=long1 --time=48:00:00 --qos=fastlane temp.sh

It's been more than a day now and the program seems to be stuck at:

INFO    : Parameter list:
Sample = out
Reference annotation index = /home/urangasw/Softwares/scTE/out.inclusive.idx
Minimum number of genes required = 100
Minimum number of counts required = 400
Number of threads = 16

INFO    : Loading the genome annotation index... 2021-12-26 11:45:22
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Finished loading the genome annotation index... 2021-12-26 11:46:20

INFO    : Processing BAM/SAM files ...2021-12-26 11:46:20
INFO    : Input SAM/BAM file appears to be valid
INFO    : Using parabam2bed as more than 1 input BAM
['1', '10', '10_GL383545V1_ALT', '10_GL383546V1_ALT', '10_KI270824V1_ALT', '10_KI270825V1_ALT', '10_KN196480V1_FIX', '10_KN538365V1_FIX', '10_KN538366V1_FIX', '10_KN538367V1_FIX', '10_KQ090020V1_ALT', '$
sed: couldn't write 49 items to stdout: Broken pipe
sed: couldn't write 49 items to stdout: Broken pipe
sed: couldn't write 54 items to stdout: Broken pipe
awk: cmd. line:1: (FILENAME=- FNR=347939132) fatal: print to "standard output" failed (Broken pipe)
samtools view: writing to standard output failed: Broken pipe
samtools view: error closing standard output: -1
sed: couldn't write 49 items to stdout: Broken pipe
sed: couldn't write 49 items to stdout: Broken pipe
sed: couldn't write 54 items to stdout: Broken pipe
awk: cmd. line:1: (FILENAME=- FNR=538652353) fatal: print to "standard output" failed (Broken pipe)
samtools view: writing to standard output failed: Broken pipe
samtools view: error closing standard output: -1
INFO    : Done BAM/SAM files processing ...2021-12-26 15:14:43

INFO    : Splitting ...2021-12-26 15:14:44
INFO    : Executing multiple thread path with 16 threads
UR CR
UR CR
UR CR
UR CR
INFO    : Finished processing sample files 2021-12-26 18:12:14

INFO    : Fetching from the annotation index... 2021-12-26 18:12:14
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items

Each of the 4 BAM files is approximately 40 GB in size. Please let me know what I should change to run the program successfully. Thanks in advance!

jphe commented 2 years ago

I suspect it has run out of memory. scTE takes ~10 GB of memory per thread for the human and mouse genomes, so you can try with fewer threads.
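To make the arithmetic concrete, here is a minimal sketch that turns the ~10 GB per thread rule of thumb from this comment into a memory request. The function name `estimate_mem_gb` is hypothetical and not part of scTE, and the per-thread figure is an estimate rather than a measured value.

```shell
#!/bin/bash
# Rough memory estimate for an scTE run, assuming the ~10 GB per thread
# figure quoted in this thread (an approximation, not a guarantee).
# estimate_mem_gb is a hypothetical helper, not an scTE command.
estimate_mem_gb() {
    local threads=$1
    local per_thread_gb=${2:-10}   # default: ~10 GB per thread
    echo $(( threads * per_thread_gb ))
}

estimate_mem_gb 16   # prints 160
estimate_mem_gb 4    # prints 40
```

By this estimate, `-p 16` would want roughly 160 GB, well above the 90000mb requested for the job above, while `-p 4` would want roughly 40 GB.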

me37uday commented 2 years ago

Thanks for your response.

By fewer threads, do you mean reducing the -p parameter of the command? Do you have any idea how long the job would usually take for 4 samples of 40 GB each? I ask because the program doesn't terminate by itself and instead keeps running until the time allocated for it on the server runs out.

Thanks, Uday

jphe commented 2 years ago

Yes, set it with the -p parameter.

It usually takes 1-2 hours for a 40-60 GB BAM file on my server with 4 threads.

me37uday commented 2 years ago

Thanks :)

Do the BAM files need to be sorted and indexed before running scTE?

me37uday commented 2 years ago

I tried :

scTE -i *.bam -o out -x /home/urangasw/Softwares/scTE/out.inclusive.idx --min_genes 100 --min_counts 400 -p 4

Yet this happens :

INFO    : Parameter list:
Sample = out
Reference annotation index = /home/urangasw/Softwares/scTE/out.inclusive.idx
Minimum number of genes required = 100
Minimum number of counts required = 400
Number of threads = 4

INFO    : Loading the genome annotation index... 2021-12-28 12:44:41
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Finished loading the genome annotation index... 2021-12-28 12:45:31

INFO    : Processing BAM/SAM files ...2021-12-28 12:45:31
INFO    : Input SAM/BAM file appears to be valid
INFO    : Using parabam2bed as more than 1 input BAM
['1', '10', '10_GL383545V1_ALT', '10_GL383546V1_ALT', '10_KI270824V1_ALT', '10_KI270825V1_ALT', '10_KN196480V1_FIX', '10_KN538365V1_FIX', '10_KN538366V1_FIX', '10_KN538367V1_FIX', '10_KQ090020V1_ALT', '$
sed: couldn't write 51 items to stdout: Broken pipe
sed: couldn't write 51 items to stdout: Broken pipe
sed: couldn't write 56 items to stdout: Broken pipe
awk: cmd. line:1: (FILENAME=- FNR=325568226) fatal: print to "standard output" failed (Broken pipe)
samtools view: writing to standard output failed: Broken pipe
samtools view: error closing standard output: -1

Can you please tell me how many resources to allocate for this job on the server?

Currently it is :

sbatch --ntasks=1 --cpus-per-task=40 --mem=90000mb --partition=regular1 --time=10:00:00 --qos=fastlane temp.sh

jphe commented 2 years ago

You don't need to sort and index the BAM file.

scTE takes ~10 GB of memory per thread for the human and mouse genomes; with 4 threads, it needs 40-60 GB of memory.

me37uday commented 2 years ago

Okay, thanks for letting me know.

Still no luck.

I ran it on the server using the following command :

sbatch --ntasks=1 --cpus-per-task=32 --mem=60000mb --partition=regular1 --time=08:00:00 --qos=fastlane temp.sh

I get the same issue :

INFO    : Parameter list:
Sample = out
Reference annotation index = /home/urangasw/Softwares/scTE/out.inclusive.idx
Minimum number of genes required = 100
Minimum number of counts required = 400
Number of threads = 4

INFO    : Loading the genome annotation index... 2022-01-05 12:13:17
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Finished loading the genome annotation index... 2022-01-05 12:14:07

INFO    : Processing BAM/SAM files ...2022-01-05 12:14:07
INFO    : Input SAM/BAM file appears to be valid
INFO    : Using parabam2bed as more than 1 input BAM
['1', '10', '10_GL383545V1_ALT', '10_GL383546V1_ALT', '10_KI270824V1_ALT', '10_KI270825V1_ALT', '10_KN196480V1_FIX', '10_KN538365V1_FIX', '10_KN538366V1_FIX', '10_KN538367V1_FIX', '10_KQ090020V1_ALT', '$
[E::bgzf_read] Read block operation failed with error 4 after 0 of 4 bytes
[main_samview] truncated file.
samtools view: error closing "Patient4_possorted_genome_bam.bam": -1
sed: couldn't write 52 items to stdout: Broken pipe
sed: couldn't write 52 items to stdout: Broken pipe
sed: couldn't write 57 items to stdout: Broken pipe
awk: cmd. line:1: (FILENAME=- FNR=146761731) fatal: print to "standard output" failed (Broken pipe)
samtools view: writing to standard output failed: Broken pipe
samtools view: error closing standard output: -1

Worst part, the program doesn't stop running either until the server allocation time is done.

Any alternate way to get through this issue? Please let me know.

Thanks, Uday

me37uday commented 2 years ago

Also what is the expected output for running on the test data that came along with the package?

I get the following :

barcodes,4933401J01Rik,B1F,B1F1,B1F2,B1_Mm,B1_Mur2,B1_Mur3,B1_Mur4,B1_Mus1,B1_Mus2,B2_Mm2,B3,B3A,B4,B4A,ERVB4_1B-I_MM-int,Gm10568,Gm18956,Gm1992,Gm19938,Gm26206,Gm27396,Gm37180,Gm37329,Gm37363,Gm37381,G$
ATCGAGTGTTTCGCTC,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
CTAGAGTGTTTCGCTC,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
GACTAGTGTTTCGCTC,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
TCGAAGTGTTTCGCTC,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

Is it correct? Or is the program not installed as it should be? Is there a vignette for scTE that I haven't come across?

Please let me know.

jphe commented 2 years ago

It seems your BAM files are truncated. How did you generate them?
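One way to look for truncation independently of scTE is to check whether each file still ends with the 28-byte BGZF end-of-file block defined in the SAM/BAM specification. The sketch below does this with standard tools; the function name `has_bgzf_eof` is hypothetical, and `samtools quickcheck` performs a similar (and more thorough) check if it is available. A missing marker suggests an incomplete file, but a present marker does not prove the file is otherwise intact.

```shell
#!/bin/bash
# Return 0 if the file ends with the 28-byte BGZF EOF marker, 1 otherwise.
# has_bgzf_eof is a hypothetical helper written for this illustration.
has_bgzf_eof() {
    local expected="1f8b08040000000000ff0600424302001b0003000000000000000000"
    local actual
    # Dump the last 28 bytes as hex and strip spaces and newlines.
    actual=$(tail -c 28 "$1" | od -An -v -tx1 | tr -d ' \n')
    [ "$actual" = "$expected" ]
}

# Check every file given on the command line, e.g.: ./check_eof.sh *.bam
for bam in "$@"; do
    if has_bgzf_eof "$bam"; then
        echo "$bam: EOF marker present"
    else
        echo "$bam: EOF marker missing (file may be truncated)"
    fi
done
```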

me37uday commented 2 years ago

The bam files were generated using Cell Ranger v.3.0.2, I noticed I should have set the CB and UMI flags as follows :

scTE -i *.bam -o out -x /home/urangasw/Softwares/scTE/out.inclusive.idx -p 4 --keeptmp True -CB CB -UMI UB

However, I ended up getting the following :

INFO    : Parameter list:
Sample = out
Reference annotation index = /home/urangasw/Softwares/scTE/out.inclusive.idx
Minimum number of genes required = 200
Minimum number of counts required = None
Number of threads = 4

INFO    : Loading the genome annotation index... 2022-01-05 13:25:20
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Finished loading the genome annotation index... 2022-01-05 13:26:10

INFO    : Processing BAM/SAM files ...2022-01-05 13:26:10
ERROR   : The input file Patient2_possorted_genome_bam.bam has no cell barcodes information, plese make sure the aligner have add the cell barcode key, or set CB to False
['1', '10', '10_GL383545V1_ALT', '10_GL383546V1_ALT', '10_KI270824V1_ALT', '10_KI270825V1_ALT', '10_KN196480V1_FIX', '10_KN538365V1_FIX', '10_KN538366V1_FIX', '10_KN538367V1_FIX', '10_KQ090020V1_ALT', '$

The content of that particular bam file looks like this :

samtools view Patient2_possorted_genome_bam.bam | head -n 5                                                                                                            
NB501949:264:HY7YKBGX9:3:12609:17254:10890      272     1       10539   1       62M29S  *       0       0       CACCGAAATCTGTGCAGAGGAGAACGCAGCTCCGCCCTCGCGGTGCTCTCCGGGTCTGTGCTCCCCATGTACTCTGCGTTGATACCACTGCEEEEEEAEEEEE<AA<E/EEAEEEAEEEEAEAEEEEE<EEEEEAE</EEEEEEAEEEEAEEEEEEEEEEEEEEEEEEEEAEEEEEEAAAAA     NH:i:3  HI:i:2  AS:i:59 nM:i:1  RE:A:I  li:i:0  BC:Z:AGGCCCGA   QT:Z:AAAAAEEE   CR:Z:CACGTGGGTGCTTCAA   CY:Z:AAAAAEEEEEEEEEEE      CB:Z:CACGTGGGTGCTTCAA-1 UR:Z:TGACCATCAAGA       UY:Z:EEEEAEEEEEEE       UB:Z:TGACCATCAAGA       RG:Z:10X_Run4:0:1:HY7YKBGX9:3
NB501949:264:HY7YKBGX9:1:11302:18051:15704      256     1       11283   0       91M     *       0       0       GCGCCCCCTGCTGGCGCCGGGGCACTGCAGGGCCCTCTTGCTTACTGTATAGTGGTGGCACGCCGCCTGCTGGCAGCTAGGGACATTGCAGAAAAAEEEEEEEEEEEEEEEEEEEEEEEEE/EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAE<EEEAAEAEEEEEEEEEE<EEEEEEA     NH:i:6  HI:i:3  AS:i:89 nM:i:0  RE:A:I  li:i:0  BC:Z:CCAAGATG   QT:Z:AAAAAEEE   CR:Z:TCTAACTTCTTACTGT   CY:Z:AAAAAEEEEEEEEEEE      CB:Z:TCTAACTTCTTACTGT-1 UR:Z:GGAGTTTTGCTT       UY:Z:EEEEEEEEEEEE       UB:Z:GGAGTTTTGCTT       RG:Z:10X_Run4:0:1:HY7YKBGX9:1
NB501949:264:HY7YKBGX9:4:21607:12429:13050      256     1       11294   0       26S65M  *       0       0       AAGCAGTGGTATCAACGCAGAGTACATGGGGCCGGGGCACTGCAGGGCCCTCTTGCTTACTGTATAGTGGTGGCACGCCGCCTGCTGGCAGAAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEE/EEEEEEEEE/EEEEAEEEEEEAEEEEEEEEEEEAEEEEEEEEEEAE/EEEEA     NH:i:6  HI:i:4  AS:i:61 nM:i:1  RE:A:I  li:i:0  BC:Z:TACGTGAC   QT:Z:AAAAAEEE   CR:Z:ACACCAACATTACTCT   CY:Z:AAAAAEEEEEEEEEEE      CB:Z:ACACCAACATTACTCT-1 UR:Z:GTTGGCTTTGAT       UY:Z:EEEEEEEEEEEE       UB:Z:GTTGGCTTTGAT       RG:Z:10X_Run4:0:1:HY7YKBGX9:4
NB501949:264:HY7YKBGX9:3:13608:9852:8597        256     1       11297   0       91M     *       0       0       CGCCGGGGCACTGCAGGGCCCTCTTGCTTACTGTATAGTGGTGGCACGCCGCCTGCTGGCAGCTAGGGACATTGCAGGGTCCTCTTGCTCAA6AAAEEEAEEE6EEEEEEEEEEEEAEEEEEEEEEEEEEE/EEEEEEEEEA<EA<EEEEAE/EEEEEEEEAEEEEEEEEEE<EEEEEEEEE     NH:i:5  HI:i:2  AS:i:89 nM:i:0  RE:A:I  li:i:0  BC:Z:AGGCCCGA   QT:Z:AAAAAEEE   CR:Z:CCCGGAAAGAGATTCA   CY:Z:AAAAAEEEEEEEEEEE      CB:Z:CCCGGAAAGAGATTCA-1 UR:Z:ATTTAGGGGCAC       UY:Z:EEEEEEEEEEEE       UB:Z:ATTTAGGGGCAC       RG:Z:10X_Run4:0:1:HY7YKBGX9:3
NB501949:264:HY7YKBGX9:3:23512:8334:16549       256     1       11305   0       91M     *       0       0       CACTGCAGGGCCCTCTTGCTTACTGTATAGTGGTGGCACGCCGCCTGCTGGCAGCTAGGGACATTGCAGGGTCCTCTTGCTCAAGCTGTAGAAAAAEEEEEEAEEAA/EA/EA/E//E/EEEAE/EEEEE/EEEEE//EE//EEEEEEEE//AE/EAEEAE/AEAE6</AE/EEE//</EE/     NH:i:5  HI:i:2  AS:i:87 nM:i:1  RE:A:I  li:i:0  BC:Z:AGGCCCGA   QT:Z:AAAAAEE/   CR:Z:AAGTCGTAGCGTGAGT   CY:Z:AAAAAEAEEEEAEEEE      CB:Z:AAGTCGTAGCGTGAGT-1 UR:Z:ACTACATTGTCA       UY:Z:EEEEEEEEEEEE       UB:Z:ACTACATTGTCA       RG:Z:10X_Run4:0:1:HY7YKBGX9:3 

I am sorry I do not understand the bam completely. But does it look truncated/missing barcode information from the snippet I have provided? Do I need to filter the bam file in a certain way?

jphe commented 2 years ago

For a quick check, you can test the first 5000 reads of the BAM file:

samtools view Patient2_possorted_genome_bam.bam | head -5000| grep "CB:Z:" | wc -l

me37uday commented 2 years ago

The command returned 4955. So I need to filter?

jphe commented 2 years ago

Yes, you need to filter the BAM file before running scTE.
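A minimal sketch of such a filter, assuming the goal is to keep only reads that carry both the CB and UB tags passed to scTE via -CB CB -UMI UB. The function name `keep_barcoded` and the file names in the usage comment are placeholders, not part of scTE.

```shell
#!/bin/bash
# Read SAM text on stdin; keep header lines plus any record that has both
# a CB:Z: (cell barcode) and a UB:Z: (UMI) tag. Everything else is dropped.
# keep_barcoded is a hypothetical helper written for this illustration.
keep_barcoded() {
    awk '/^@/ || (/\tCB:Z:/ && /\tUB:Z:/)'
}

# Assumed usage (file names are placeholders):
#   samtools view -h in.bam | keep_barcoded | samtools view -b -o in.filtered.bam -
```

Matching on the tab before each tag avoids accidental hits inside other fields such as the quality string.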

me37uday commented 2 years ago

I get the following error even after filtering :

INFO    : Parameter list:
Sample = out
Reference annotation index = /home/urangasw/Softwares/scTE/out.inclusive.idx
Minimum number of genes required = 200
Minimum number of counts required = None
Number of threads = 4

INFO    : Loading the genome annotation index... 2022-01-07 01:55:11
INFO    : Loaded '/home/urangasw/Softwares/scTE/out.inclusive.idx' binary file with 5971706 items
INFO    : Finished loading the genome annotation index... 2022-01-07 01:56:02

INFO    : Processing BAM/SAM files ...2022-01-07 01:56:02
INFO    : Input SAM/BAM file appears to be valid
INFO    : Using parabam2bed as more than 1 input BAM
['1', '10', '10_GL383545V1_ALT', '10_GL383546V1_ALT', '10_KI270824V1_ALT', '10_KI270825V1_ALT', '10_KN196480V1_FIX', '10_KN538365V1_FIX', '10_KN538366V1_FIX', '10_KN538367V1_FIX', '10_KQ090020V1_ALT', '$
sed: couldn't write 52 items to stdout: Broken pipe
sed: couldn't write 52 items to stdout: Broken pipe
sed: couldn't write 57 items to stdout: Broken pipe
awk: cmd. line:1: (FILENAME=- FNR=286373960) fatal: print to "standard output" failed (Broken pipe)
samtools view: writing to standard output failed: Broken pipe
samtools view: error closing standard output: -1
sed: couldn't write 44 items to stdout: Broken pipe
sed: couldn't write 44 items to stdout: Broken pipe
sed: couldn't write 49 items to stdout: Broken pipe
awk: cmd. line:1: (FILENAME=- FNR=482660794) fatal: print to "standard output" failed (Broken pipe)
samtools view: writing to standard output failed: Broken pipe
samtools view: error closing standard output: -1
INFO    : Done BAM/SAM files processing ...2022-01-07 04:50:53

INFO    : Splitting ...2022-01-07 04:50:53
INFO    : Executing multiple thread path with 4 threads
UB CB
UB CB
UB CB

I used the following commands :

#!/bin/bash

scTE -i *.bam -o out -x /home/urangasw/Softwares/scTE/out.inclusive.idx -p 4 --keeptmp True -CB CB -UMI UB

sbatch --ntasks=1 --cpus-per-task=40 --mem=60000mb --partition=long1 --time=10:00:00 --qos=fastlane temp.sh

However, it seems to have solved the truncated file issue. Do you think this is due to memory? I don't understand why, because I have been using 4 threads and allocating 60 GB of memory for 10 hours, and yet it seems to fail! Do you think I should increase the threads and the memory? Please let me know. Thanks!

jphe commented 2 years ago

It's not clear to me either; it seems to be a problem with sed on the Linux system.

How many folders exist in the filename_scTEtmp directory? (If the directory does not exist, run scTE with --keeptmp True.) Can you also paste a screenshot showing how many files are under each folder? The filename_scTEtmp directory is useful for debugging.

And can you try the command below to see whether it reports any errors?

samtools view -@ 4 Patient2_possorted_genome_bam.bam | awk '{OFS="\t"}{for(i=12;i<=NF;i++)if($i~/CB:Z:/)n=i}{for(i=12;i<=NF;i++)if($i~/UB:Z:/)m=i}{print $3,$4,$4+100,$n,$m}' | sed -r 's/CB:Z://g' | sed -r 's/UB:Z://g'| sed -r 's/^chr//g' | awk '!x[$4$5]++' | gzip -c > filename.o1.bed.gz
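When a long pipeline like this one prints "Broken pipe", it can help to see which stage actually failed. The sketch below uses bash's `PIPESTATUS` array for that; `report_stages` is a hypothetical helper, and the toy pipeline only stands in for the samtools/awk/sed/gzip chain above. A stage killed by SIGPIPE because a later stage exited first typically shows status 141.

```shell
#!/bin/bash
# Print every pipeline stage that exited non-zero; return 1 if any did.
# report_stages is a hypothetical helper written for this illustration.
report_stages() {
    local stage=1 status failed=0
    for status in "$@"; do
        if [ "$status" -ne 0 ]; then
            echo "stage $stage exited with status $status"
            failed=1
        fi
        stage=$(( stage + 1 ))
    done
    return "$failed"
}

# Toy pipeline: the middle stage (grep with no match) exits with status 1.
printf 'a\nb\n' | grep c | sort > /dev/null
# Copy PIPESTATUS right away; the next command would overwrite it.
statuses=("${PIPESTATUS[@]}")
report_stages "${statuses[@]}" || true   # prints: stage 2 exited with status 1
```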

me37uday commented 2 years ago

I ran it on another server and it worked.

I used the following configuration (120 threads):

#!/bin/bash
#
#SBATCH -p all
#SBATCH -N 1
#SBATCH --sockets-per-node=4
#SBATCH --cores-per-socket=20
#SBATCH --threads-per-core=2
#SBATCH -t 24:00:00
#SBATCH -w valis-02

To my surprise, it worked without me having to specify memory for the job.

Unsure why it wasn't working on the other server. But if I were to guess, I think it was because of the number of threads and the duration of server time. It took ~9 hours to complete when it finally did for 4 samples each of ~35-40GB in size.

Thanks for your time and scTE :)