dputhier / pygtftk

A python package and a set of shell commands to handle GTF files
GNU General Public License v3.0

closest_gn_to_feat output format #28

Closed dputhier closed 5 years ago

dputhier commented 5 years ago

The output should be centered on either genes or peaks. It should provide some flexibility about which gene feature is used (tts, tss, body). Can a peak or a gene be associated with several counterparts?

dputhier commented 5 years ago

Fixed in f5a4fa3e47f9af425f0820e6d5904cb3e639eb46. To keep only one peak per gene, use awk (for now):

  # get an example dataset
  gtftk get_example -f '*' -d simple
  # sort numerically on column 8, then keep only the first line seen for each value of column 4
  gtftk closest_gn_to_feat -t tss -r simple_peaks.bed6 -i simple.gtf -c simple.chromInfo \
      -p 10 -n transcript_id,gene_id,gene_name -gu \
    | sort -nk 8,8 | awk 'BEGIN{FS=OFS="\t"} !($4 in g){print; g[$4]=1}'
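
For anyone who would rather do the filtering step in Python, here is a rough sketch of what the `sort` and `awk` part of the pipeline above does: order the lines numerically on one column, then keep the first line seen for each value of another column. The function name `keep_first_per_key` and the toy rows are made up for illustration; this is not part of pygtftk, and what columns 4 and 8 hold in the real `closest_gn_to_feat` output is an assumption you should check against your own output.

```python
def keep_first_per_key(lines, key_col=4, sort_col=8):
    """Sort tab-separated lines numerically on sort_col, then keep the
    first line seen for each distinct value of key_col (1-based columns).

    Hypothetical helper, not part of pygtftk.
    """
    def sort_value(fields):
        # Roughly like `sort -n`: values that are not numbers count as 0.
        try:
            return float(fields[sort_col - 1])
        except (ValueError, IndexError):
            return 0.0

    rows = [line.rstrip("\n").split("\t") for line in lines if line.strip()]
    rows.sort(key=sort_value)

    seen = set()
    kept = []
    for fields in rows:
        # A missing column behaves like an empty key, as in awk.
        key = fields[key_col - 1] if len(fields) >= key_col else ""
        if key not in seen:
            seen.add(key)
            kept.append("\t".join(fields))
    return kept


if __name__ == "__main__":
    # Toy rows, invented for this example (column 4 = key, column 8 = number).
    toy = [
        "chr1\t10\t20\tpeakA\t0\t+\tG1\t500",
        "chr1\t10\t20\tpeakA\t0\t+\tG2\t50",
        "chr1\t40\t60\tpeakB\t0\t+\tG1\t-30",
    ]
    for row in keep_first_per_key(toy):
        print(row)
```

To use it on real output, save the result of `gtftk closest_gn_to_feat` to a file and pass the open file handle as `lines`. Note that Python's sort is stable, so ties on the sort column keep their input order, which may differ from how `sort` breaks ties.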