r3fang / SnapATAC

Analysis Pipeline for Single Cell ATAC-seq
GNU General Public License v3.0
306 stars 126 forks source link

runMACS error #7

Closed RZiffra closed 5 years ago

RZiffra commented 5 years ago

Hello,

I am running into the following error when I try to call peaks using the runMACS function:

peaks_C16.df = runMACS(obj = combined[which(combined@cluster==16),],tmp.folder=getwd(),output.prefix="combined",path.to.snaptools="/home/rziffra/.local/bin/snaptools",path.to.macs="/usr/local/bin/macs2",gsize="hs",buffer.size=500,macs.options="--nomodel --shift 37 --ext 73 --pvalue 1e-2 -B --SPMR --call-summits",num.cores=2) Epoch: checking input parameters ... Epoch: extracting fragments from each snap files ... INFO @ Thu, 28 Mar 2019 10:43:59: Command line: callpeak -t /media/RND/HDD-5/rziffra/V1_GW20_SC/file290a2d88f415.bed.gz -f BED -g hs --nomodel --shift 37 --ext 73 --pvalue 1e-2 -B --SPMR --call-summits -n combined ARGUMENTS LIST: name = combined format = BED ChIP-seq file = ['/media/RND/HDD-5/rziffra/V1_GW20_SC/file290a2d88f415.bed.gz'] control file = None effective genome size = 2.70e+09 band width = 300 model fold = [5, 50] pvalue cutoff = 1.00e-02 qvalue will not be calculated and reported as -1 in the final output. Larger dataset will be scaled towards smaller dataset. Range for calculating regional lambda is: 10000 bps Broad region calling is off Paired-End mode is off Searching for subpeak summits is on MACS will save fragment pileup signal per million reads

INFO @ Thu, 28 Mar 2019 10:43:59: #1 read tag files... INFO @ Thu, 28 Mar 2019 10:43:59: #1 read treatment tags... Exception ZeroDivisionError: 'integer division or modulo by zero' in 'MACS2.IO.Parser.GenericParser.tsize' ignored INFO @ Thu, 28 Mar 2019 10:44:00: #1 tag size is determined as 0 bps INFO @ Thu, 28 Mar 2019 10:44:00: #1 tag size = 0 INFO @ Thu, 28 Mar 2019 10:44:00: #1 total tags in treatment: 0 INFO @ Thu, 28 Mar 2019 10:44:00: #1 user defined the maximum tags... INFO @ Thu, 28 Mar 2019 10:44:00: #1 filter out redundant tags at the same location and the same strand by allowing at most 1 tag(s) INFO @ Thu, 28 Mar 2019 10:44:00: #1 tags after filtering in treatment: 0 Traceback (most recent call last): File "/usr/local/bin/macs2", line 617, in main() File "/usr/local/bin/macs2", line 57, in main run( args ) File "/usr/local/lib/python2.7/dist-packages/MACS2/callpeak_cmd.py", line 112, in run info("#1 Redundant rate of treatment: %.2f", float(t0 - t1) / t0) ZeroDivisionError: float division by zero Error in runMACS(obj = combined[which(combined@cluster == 16), ], tmp.folder = getwd(), : 'MACS' call failed

r3fang commented 5 years ago

Hello,

can you check if /media/RND/HDD-5/rziffra/V1_GW20_SC/file290a2d88f415.bed.gz is empty?

RZiffra commented 5 years ago

It is empty. So is the issue with snaptools pulling the fragments out of the snap files? It does successfully create barcode files, but it looks like all the barcodes get concatenated into a single long string.

r3fang commented 5 years ago

that might be the problem. When you create the combined objects, did you manually change the barcode?

RZiffra commented 5 years ago

no. I just ran the following to create the object:

file.name = c("/media/RND/HDD-5/rziffra/V1_GW20_SC/Rep1.snap","/media/RND/HDD-5/rziffra/V1_GW20_SC/Rep2.snap") combined = createSnap(file.name, sample = c("Rep1","Rep2"),num.cores = 4)

r3fang commented 5 years ago

It seems the barcode in the snap file does not match with the barcode in the snap object. Can you do the following:

combined@barcode[1:10]

see if the barcodes have been modified.

RZiffra commented 5 years ago

Hi Rongxin,

just ran the following which seemed to fix the barcode file concatenation:

I pulled the following lines out of the runMACS.R file:

barcode.use = obj@barcode[idx] write.table(barcode.use, file = barcode.files[[i]], append = FALSE, quote = FALSE, sep = "\t", eol = "\n", na = "NA", dec = ".", row.names = FALSE, col.names = FALSE, qmethod = c("escape", "double"), fileEncoding = "")

I tried the following:

barcode.use = combined@barcode[1:100] write.table(barcode.use, file = "test.txt", append = FALSE, quote = FALSE, sep = "\t", eol = "\n", na = "NA", dec = ".", row.names = FALSE, col.names = FALSE, qmethod = c("escape", "double"), fileEncoding = "")

the barcode file is concatenated into a single string.

Then I made the following change:

barcode.use = combined@barcode[1:100] write.table(barcode.use, file = "test.txt", append = FALSE, quote = FALSE, sep = "\t", eol = "\r\n", na = "NA", dec = ".", row.names = FALSE, col.names = FALSE, qmethod = c("escape", "double"), fileEncoding = "")

This gives me a file with each barcode on a separate line.

r3fang commented 5 years ago

Interesting. What’s output of

combined@barcode[1:100]

I want to make sure the barcode has no problem. Also, what’s the python version you are using for snaptools

RZiffra commented 5 years ago

sorry Rongxin, just realized it was the text editor i was using to examine the files - when I just do:

less barcodefile.txt

each barcode is separated on a single line with your original code.

Here's the output of combined@barcode[1:100]: [1] "AAACGAAAGAGAGTAG" "AAACGAACACGTTACA" "AAACGAACAGGGAGTT" [4] "AAACGAACATATAGAG" "AAACGAACATATAGTG" "AAACGAAGTGATCAGG" [7] "AAACGAATCCACGAGC" "AAACGAATCCCAGCGA" "AAACGTACATATAGAG" [10] "AAACGTAGTGATCAGG" "AAACTCGAGATGCGCA" "AAACTCGAGTTGCGCA" [13] "AAACTCGAGTTTGGAA" "AAACTCGCAATTCTCT" "AAACTCGGTAGGTGAC" [16] "AAACTCGGTCAGCAAG" "AAACTCGGTCGCTACG" "AAACTCGGTCTGTCCT" [19] "AAACTCGGTGTCGTCG" "AAACTCGTCTTCACTA" "AAACTCGTGATGCGCA" [22] "AAACTGCAGCACGTAG" "AAACTGCCACGCTCAG" "AAACTGCTCTCCGTGG" [25] "AAACTGCTCTCCTTGG" "AAAGATGAGATCACCT" "AAAGATGAGCTGAAAT" [28] "AAAGATGCAGTAACTC" "AAAGATGTCGGGCTCA" "AAAGATGTCTATCTTG" [31] "AAAGGATGTCAGCAAG" "AAAGGATGTTCTTTCA" "AAAGGGCAGAGCTGTG" [34] "AAAGGGCCACCTATTT" "AAAGGGCCACTGCTCT" "AAAGGGCCAGGGTAAC" [37] "AAAGGGCCATTACTTC" "AAAGGGCGTAGATTAG" "AAAGGGCGTAGTTTAG" [40] "AAAGGGCGTGAATGGC" "AAAGGGCTCAGAATGA" "AAAGGGCTCGATGTAC" [43] "AAAGGTTGTCAGCAAG" "AAAGTTGTCGGGCTCA" "AAATGAGAGAGAAGCA" [46] "AAATGAGCATGCACTA" "AAATGAGGTAACGGAC" "AAATGAGGTTGCCTGG" [49] "AAATGAGTCCGTACGG" "AAATGAGTCCTCTCTT" "AAATGAGTCGAACACT" [52] "AAATGAGTCGAGCGCT" "AAATGAGTCTCCTTGG" "AAATGAGTGAGAAGCA" [55] "AAATGCCAGCGAGCTA" "AAATGCCAGGATAGTG" "AAATGCCAGTATAGTG" [58] "AAATGCCCAAACGTTC" "AAATGCCCAACGAGGT" "AAATGCCCAACGTGGT" [61] "AAATGCCCAGTAGTTC" "AAATGCCCATATTGGC" "AAATGCCGTAGCGAGT" [64] "AAATGCCGTCAGAAGC" "AAATGCCGTGAGGGTT" "AAATGCCGTGGTAAGC" [67] "AAATGCCGTGTGGGTT" "AAATGCCTCTGAGTCA" "AAATGTGAGAGAAGCA" [70] "AAATGTGTCCGTACGG" "AAATGTGTCCTCTCTT" "AAATGTGTCGAGCGCT" [73] "AACAAAGAGACCGCAA" "AACAAAGAGGATCCTT" "AACAAAGCAATTGCCA" [76] "AACAAAGCACTCCTCA" "AACAAAGCATCACAAC" "AACAAAGCATTACTCT" [79] "AACAAAGTCCACGAGC" "AACAAAGTCCACGCTT" "AACAAAGTGACCGCAA" [82] "AACAGTCCAATGTAAG" "AACAGTCCAGGTAGCA" "AACAGTCGCCGGAAAG" [85] "AACAGTCGTATCCCTC" "AACAGTCTCAACGTGG" "AACAGTCTCAACTTGG" [88] "AACAGTCTCACGATTG" "AACAGTCTCACTGGTA" "AACAGTCTCCGGAAAG" [91] "AACAGTCTCCGGAATG" "AACAGTCTCCGGTAAG" "AACAGTCTCCGGTATG" [94] "AACAGTCTCCGGTTTG" "AACAGTCTCCGTGCGA" "AACAGTCTCCGTGCGT" [97] "AACATCGAGAGTGGAA" "AACATCGCAAACGACG" "AACATCGCAATGATGA" [100] "AACATCGCAATGATGT"

I am using python2.7 for snaptools.

r3fang commented 5 years ago

last thing, can you run snaptools dump-barcode --snap-file=/media/RND/HDD-5/rziffra/V1_GW20_SC/Rep1.snap --output-file=Rep1.barcode.txt and show me the first few lines of the Rep1.barcode.txt.

RZiffra commented 5 years ago

rziffra@RNDGenomeCenter:/media/RND/HDD-5/rziffra/V1_GW20_SC$ snaptools dump-barcode --snap-file=Rep1.snap --output-file=Rep1.barcode.txt rziffra@RNDGenomeCenter:/media/RND/HDD-5/rziffra/V1_GW20_SC$ head Rep1.barcode.txt Barcode TN UM SE SA PE PP PL US UQ CM AAAAAAAAAAAAAAAA 141 7 0 0 5 5 5 5 0 AAAAAAGAGCTTCAAC 113 96 0 0 96 96 96 95 0 AAAAAAGTCGCCACTT 228 168 0 0 168 168 168 166 0 AAAAAGACATGGTAAA 134 94 0 0 94 94 94 94 0 AAAAAGAGCTTCAACT 253 243 0 0 243 243 243 203 0 AAAACCCAGACAACGC 108 97 0 0 97 97 97 96 0 AAAACCGCAGGTGTCC 228 198 0 0 196 196 196 196 0 AAAACCGTCTGTGTCC 122 114 0 0 114 114 114 113 0 AAAACCGTTTCGACAT 207 195 0 0 195 195 195 167 0

r3fang commented 5 years ago

when you runMACS, how long does Epoch: extracting fragments from each snap files ... last? I think the error is because snaptools fail to extract the reads.

Meanwhile, can you try this: 1) create a txt file (barcode.txt) with the following 9 barcodes

AAAAAAAAAAAAAAAA
AAAAAAGAGCTTCAAC
AAAAAAGTCGCCACTT
AAAAAGACATGGTAAA
AAAAAGAGCTTCAACT
AAAACCCAGACAACGC
AAAACCGCAGGTGTCC
AAAACCGTCTGTGTCC
AAAACCGTTTCGACAT

2) dump reads that belonging to these 9 barcodes using snaptools:

snaptools dump-fragment --snap-file=Rep1.snap --output-file=Rep1.barcode.bed.gz --barcode-file=barcode.txt 

does it work?

RZiffra commented 5 years ago

It does not last long at all - maybe a couple seconds.

Tried this test - the resulting bed file is empty.

r3fang commented 5 years ago

Thanks! Obviously, dump-fragment failed. I will test it again and fix this issue. Thanks for report this error

r3fang commented 5 years ago

Hi Ryan,

I tested it again and it does not have this problem.

Can you updated snaptools to the latest version (1.4.3) by 'pip install snaptools' and try it again?

RZiffra commented 5 years ago

Hi Rongxin,

That seems to have fixed the issue!

Thanks!

r3fang commented 5 years ago

Glad to know!