Closed ShaluJhanwar closed 9 years ago
NucleoATAC is working fine on example files. No idea what's wrong when I run it on my own data. BAM files are sorted. All the requirements are also satisfied. numpy.version 1.10.1 scipy.version 0.13.2 matplotlib.version 1.3.1 cython.version 0.23.4 pysam.version 0.8.3
Hi, the issue I think is with the scipy version-- can you try upgrading (to version 0.16.0 or greater) to see if it fixes issue? Someone else had this same error and updating scipy fixed the issue, and I meant to add the version of scipy to the setup.py script but apparently have not done so yet. It's definitely not an issue with the PWM. The entry won't exist as-is in your bedfile because there is some adjustment to peaks (extending slightly, merging overlapping peaks).
Hi Alicia, thanks for the reply. It's already approx. 5 hours on 5 processes, but one run of NucleoATAC for the mouse genome didn't yet finished. The output files are still growing. Could you please let me know how much time one run would take so that we can be sure if NucleoATAC is working. ii) There are different output generated e.g. occ* files, pos files etc. After reading Nucleosome paper, could you please confirm: a) I guess *nucpos.bed.gz is the file which will provide me nucleosome position? There are more than 3 columns, from where I can get their header information? b) Do I need to filter nucleosome positions data based on some cut-offs? What would you suggest?
Many thanks! Shalu
Yes *.nucpos.bed.gz is nucleosome positions. I would suggest to filter the positions based on the occupancy column. There are already cutoffs for z-score (can be changed by using the individual nucleoatac commands instead of 'run').
As for time, depends on how much of the genome your bed file covers...
Hi Alicia, After scipy version adding, one run of NucleoATAC finished with following: Command run: /users/GD/tools/NucleoATAC/NucleoATAC/bin/nucleoatac run --bed /no_backup/so/sjhanwar/WGBS_mouse/NucleoATAC/WNN1/WNN1_6640_fseq_peaks.bed --bam /no_backup/so/sjhanwar/WGBS_mouse/NucleoATAC/WNN1/WNN1_6640_AGGCAGA.sort.bam --fasta /users/GD/resource/mouse/mm9/full/mm9_NCBI73.fasta --out WNN1 --cores 8 nucleoatac version 0.3.0 start run at: 2015-10-28 08:50 ---------Step1: Computing Occupancy and Nucleosomal Insert Distribution--------- Making figure ---------Step2: Processing Vplot------------------------------------------------ /software/so/el6.3/PythonPackages-2.7.6/lib/python2.7/site-packages/matplotlib/collections.py:548: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison if self._edgecolors == 'face': ---------Step3: Obtaining nucleosome signal and calling positions--------------- ---------Step4: Making combined nucleosome position map ------------------------ ---------Step5: Calling NFR positions------------------------------------------- end run at: 2015-10-28 15:09
real 379m11.033s user 1013m29.183s sys 11m46.624s
i) Could we remove Step2 warning/error? ii) In the *.nucpos.bed.gz, there are 19390 nucleosomes with occupancy (5th column) ranged from 0 to 1. Which cutoff of occupancy would you suggest to filter nucleosome? iii) Z-score is ranging from 3 to 346. Should I consider all the nucleosomes with higher Z-score too? Thanks! Shalu
i) I need to figure out what that warning is, I think it started occurring with a recent version of matplotlib. ii) I would suggest a fairly liberal threshold around 0.1, but might be worth looking at histogram and/or relationship with other metrics iii) default Z-score cutoff is 3. depending on whether you want more confidence at the calls (at the expense of fewer calls) you can choose a higher threshold
Hi Alicia, Since NucleoATAC provided a single dyad position in the output. I want to plot the number of reads supported for each nucleosome position (Basically a bigwig file for visualisation purpose). We have only single bp of the dyad. Could you please suggest: i) Should I consider some flanking region of the nucleosome ii) If yes, how many bp flanking can provide an approximation of the actual nucleosome peak iii) Can we get actual peak size of the nucleosome? Thanks! Shalu
Hi,
I'm trying to use NucleoATAC on our in-house generated mouse ATAC-seq data, but I found below error even with different runs on different files. I'm afraid is it because I am using default (Human) PWM for mouse as well? The fact is that chr1 3640097 3640219 entry (see below) doesn't even exist in bed peak file. Could you help solving this? Thanks! Error: Command run: /users/GD/tools/NucleoATAC/NucleoATAC/bin/nucleoatac run --bed /no_backup/so/sjhanwar/WGBS_mouse/NucleoATAC/WNN1/WNN1_6640_fseq_peaks.bed --bam /no_backup/so/sjhanwar/WGBS_mouse/NucleoATAC/WNN1/WNN1_6640_AGGCAGA.sort.bam --fasta /users/GD/resource/mouse/mm9/full/mm9_NCBI73.fasta --out WNN1 nucleoatac version 0.3.0 start run at: 2015-10-26 17:12 ---------Step1: Computing Occupancy and Nucleosomal Insert Distribution--------- /software/so/el6.3/PythonPackages-2.7.6/lib/python2.7/site-packages/numpy/core/fromnumeric.py:2641: VisibleDeprecationWarning:
rank
is deprecated; use thendim
attribute or function instead. To find the rank of a matrix seenumpy.linalg.matrix_rank
. VisibleDeprecationWarning) Caught exception when processing: chr1 3640097 3640219 None None +Traceback (most recent call last): File "/software/so/el6.3/PythonPackages-2.7.6/lib/python2.7/site-packages/nucleoatac/run_occ.py", line 30, in _occHelper occ.process(params) File "/software/so/el6.3/PythonPackages-2.7.6/lib/python2.7/site-packages/nucleoatac/Occupancy.py", line 230, in process self.callPeaks() File "/software/so/el6.3/PythonPackages-2.7.6/lib/python2.7/site-packages/nucleoatac/Occupancy.py", line 209, in callPeaks peaks = call_peaks(self.occ.smoothed_vals, sep = self.params.sep, min_signal = self.params.min_occ) File "/software/so/el6.3/PythonPackages-2.7.6/lib/python2.7/site-packages/pyatac/utils.py", line 98, in call_peaks peaks = peaks[sigvals[peaks] >= min_signal ] IndexError: arrays used as indices must be of integer (or boolean) type () Traceback (most recent call last): File "/users/GD/tools/NucleoATAC/NucleoATAC/bin/nucleoatac", line 291, in
main()
File "/users/GD/tools/NucleoATAC/NucleoATAC/bin/nucleoatac", line 75, in main
run_occ(occ_args)
File "/software/so/el6.3/PythonPackages-2.7.6/lib/python2.7/site-packages/nucleoatac/run_occ.py", line 119, in run_occ
tmp = pool1.map(_occHelper, zip(j,itertools.repeat(params)))
File "/software/so/el6.3/PythonPackages-2.7.6/lib/python2.7/multiprocessing/pool.py", line 250, in map
return self.map_async(func, iterable, chunksize).get()
File "/software/so/el6.3/PythonPackages-2.7.6/lib/python2.7/multiprocessing/pool.py", line 554, in get
raise self._value
IndexError: arrays used as indices must be of integer (or boolean) type