Open xikanfeng2 opened 3 weeks ago
if len(shared_bins) < len(bin_coords):
chrom_names = reader.get_chrom_names(args['chrom_names'])
chrom_lens = reader.get_chrom_lens(args['reference'], chrom_names)
regions = reader.get_bins_from_chromlens(chrom_lens, args['bin_size'])
full_region_path = reader.write_region_file(args['out_dir'], 'full_regions.bed', regions)
gc = reader.get_gc(args['reference'], full_region_path, args['bedtools'])
mapp = reader.get_mapp(args['out_dir'], full_region_path, args['bigwig'], args['map_file'])
filtered_regions, filtered_stats = reader.filter_bins_gc_mapp(regions, gc, mapp)
reader.write_region_file(args['out_dir'], 'filtered_regions.bed', shared_bins, stats=filtered_stats)
reader.write_cell_names(os.path.join(args['out_dir'], 'cells.txt'), shared_cells)
print(readcounts_df.shape)
cell_avgs = readcounts_df.mean(axis=1)
RDR_df = readcounts_df.div(cell_avgs, axis=0)
RDR_df = RDR_df.round(5)
readcounts_df.to_csv(os.path.join(args['out_dir'], 'readcounts.tsv'), sep='\t')
RDR_df.to_csv(os.path.join(args['out_dir'], 'RDR.tsv'), sep='\t')
I refactored the code as shown above. Note that readcounts_df must also be re-saved after this change, so that readcounts.tsv covers the same bins as filtered_regions.bed; otherwise the seacon call step fails with a dimension mismatch error.
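To illustrate why the re-save matters, here is a minimal sketch, assuming readcounts_df has one row per cell and one column per bin (which is what the mean(axis=1) / div(axis=0) calls above suggest). The helper name align_and_normalize and the toy bin labels are hypothetical and not part of SEACON.

```python
import pandas as pd

def align_and_normalize(readcounts_df, shared_bins):
    """Hypothetical helper: keep only shared bins, then compute per-cell RDR."""
    # Drop bins that did not survive filtering so the counts match the region file.
    readcounts_df = readcounts_df.loc[:, shared_bins]
    # Each row is a cell; divide every bin count by that cell's mean count.
    cell_avgs = readcounts_df.mean(axis=1)
    rdr_df = readcounts_df.div(cell_avgs, axis=0).round(5)
    return readcounts_df, rdr_df

# Toy data: 2 cells x 4 bins, of which only 3 bins are shared.
counts = pd.DataFrame(
    [[10, 20, 30, 40], [5, 5, 10, 20]],
    index=["cellA", "cellB"],
    columns=["bin0", "bin1", "bin2", "bin3"],
)
shared = ["bin0", "bin2", "bin3"]

counts_sub, rdr = align_and_normalize(counts, shared)

# Both tables now have the same shape as the filtered region list,
# so writing both of them out keeps the saved files consistent.
assert counts_sub.shape == rdr.shape == (2, len(shared))
```

If only the RDR table and the region file were rewritten, the saved read counts would still have the original number of bins, which is the kind of mismatch described above.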
The variable filtered_stats is not defined in this function.