homeveg / nuctools

software for analysis of chromatin feature occupancy profiles from high-throughput sequencing data
GNU General Public License v3.0

aggregating occupancy profile around certain features at different resolution to the original occupancy file #8

Open chrisclarkson100 opened 6 years ago

chrisclarkson100 commented 6 years ago

Hi, is it possible to calculate the average aggregate occupancy of a feature across all regions on a chromosome at a resolution different from that of the occupancy file calculated for that chromosome? For example, given an occupancy file calculated at 1 bp resolution, can one then aggregate the occupancies at 100 bp resolution? I have tried this for all chromosomes as shown below, but the aggregate files that are produced are wrong:

for CHR in {1..19} X Y
do
    # Step 1: per-chromosome occupancy at 1 bp resolution (written where step 2 reads it)
    perl /storage/projects/teif/nuctools.3.0/bed2occupancy_average.pl --input=H3K4me1/chr${CHR}.bed --output=H3K4me1/1_occup_chr${CHR}.bed --window=1
    # Step 2: aggregate the 1 bp occupancy around region centers in 100 bp windows
    perl /storage/projects/teif/nuctools.2.0/aggregate_profile.pl --window=100 --regions=regions_OI.bed --input=H3K4me1/1_occup_chr${CHR}.bed --chromosome=chr${CHR} --verbose --useCenter --upstream_delta=20000 --downstream_delta=20000 --average_aligned=H3K4me1_chr${CHR}.w100 --chromosome_col=0 --region_start_column=1 --region_end_column=2 --strand_column=2 --GeneId_column=1 --ignore_strand --force --save_aligned --AgregateProfile
done

head H3K4me1/1_occup_chr3.bed
3000000 379.062885209613
3000001 379.062885209613
3000002 379.062885209613
3000003 379.062885209613
3000004 379.062885209613
3000005 379.062885209613
3000006 379.062885209613
3000007 379.062885209613
3000008 379.062885209613
3000009 379.062885209613

head H3K4me1/H3K4me1_chr4.w100.delta_20000_20000.txt
20000 365.85209613
19900 320.88520961
19800 [no_value]
19700 [no_value]
19600 [no_value]
.........
........

The aggregate files look like this for every chromosome, with [no_value] entries like those above.
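
For comparison, the kind of result being asked for can be sketched independently of NucTools. The snippet below is only an illustration, not the algorithm used by aggregate_profile.pl: it assumes a two-column occupancy file (position, occupancy) like the head output above, and that the region centers are already known as plain coordinates. The function names read_occupancy and aggregate_profile are hypothetical.

```python
from collections import defaultdict


def read_occupancy(path):
    """Read a two-column 'position occupancy' file into a dict (assumed format)."""
    occupancy = {}
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            occupancy[int(fields[0])] = float(fields[1])
    return occupancy


def aggregate_profile(occupancy, centers, flank=20000, bin_size=100):
    """Average 1 bp occupancy around region centers in bins of bin_size bp.

    Returns a list of (relative_bin_start, mean) tuples covering
    [-flank, flank); mean is None when no position in that bin had a value.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for center in centers:
        for pos in range(center - flank, center + flank):
            value = occupancy.get(pos)
            if value is None:
                continue  # position not covered by the occupancy file
            # Floor to the start of the bin, relative to the region center.
            bin_start = ((pos - center) // bin_size) * bin_size
            sums[bin_start] += value
            counts[bin_start] += 1
    profile = []
    for bin_start in range(-flank, flank, bin_size):
        n = counts[bin_start]
        profile.append((bin_start, sums[bin_start] / n if n else None))
    return profile


# Tiny synthetic example (made-up values, not real data).
toy = {p: 2.0 for p in range(1000, 1100)}
toy.update({p: 4.0 for p in range(1100, 1200)})
toy_profile = aggregate_profile(toy, [1100], flank=200, bin_size=100)
# toy_profile == [(-200, None), (-100, 2.0), (0, 4.0), (100, None)]
```

In this sketch a bin only comes out empty when none of its positions appear in the occupancy file, so it can serve as a rough cross-check of which bins would be expected to carry values for a given set of regions.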