PeterUlz / TranscriptionFactorProfiling

Profiling of transcription factor binding sites in cell-free DNA

TFBS position? #3

Open lvran1234 opened 3 years ago

lvran1234 commented 3 years ago

Hello, sorry to disturb you. In the paper 'Inference of transcription factor binding from cell-free DNA enables tumor subtype prediction and early detection' I read: "The position was recalculated by focusing on the reported point where the meta-cluster has the highest ChIP-seq signal." Where can I find the position of the highest ChIP-seq signal? I look forward to your reply.

PeterUlz commented 3 years ago

The GTRD summary file (http://gtrd.biouml.org/downloads/19.10/chip-seq/Homo%20sapiens_meta_clusters.interval.gz) is a text file with several fields, one of which is the distance from the left-most position to the "summit" of the meta-cluster. This is what the README says:

Columns:
   1. CHROM - the name of the chromosome (e.g. chr1, chr2, ...) 
   2. START - the start position of the metacluster. The first base in a chromosome is numbered 0.
   3. END - the ending position of the metacluster. This base is not included in the metacluster.
      For example, the first 100 bases of a chromosome are described as START=0, END=100, and span the bases numbered 0-99. 
   4. summit - the offset from the START in the metacluster that gives the 'center'.
   5. uniprotId - transcription factor uniprot id
   6. tfTitle - transcription factor title.
   7. cell.set - the list of cells or tissues separated by semicolon.
   8. treatment.set - the list of experimental treatments/conditions separated by semicolon.
   9. exp.set - the list of ChIP-seq experiments supporting this metacluster.
  10. peak-caller.set - the list of peak callers supporting this metacluster
  11. peak-caller.count - the number of peak callers supporting this metacluster.
  12. exp.count - the number of ChIP-seq experiments supporting this metacluster.
  13. peak.count - the number of ChIP-seq peaks supporting this metacluster.

So for most of the experiments I took the START coordinate and added the summit offset (in bp) to arrive at the supposed "peak" of the ChIP-seq metacluster. I used this exact position as the anchor when summing over many regions, so every time you see -1000bp to +10000bp, it is relative to that exact position.
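
For anyone who wants to reproduce the START + summit calculation, here is a minimal Python sketch. It is not the code used for the paper. It assumes the file is tab-separated with the columns in the README order above, and that any header or comment line starts with `#`. The function name `summit_anchors`, the `flank` parameter, and the example row are hypothetical and only there for illustration.

```python
import csv


def summit_anchors(lines, flank=1000):
    """Yield (chrom, anchor, window_start, window_end, tf_title) per metacluster.

    `lines` is any iterable of text lines in the README column order
    (assumed tab-separated). `anchor` is START + summit, 0-based like START.
    The window is half-open: [anchor - flank, anchor + flank + 1).
    """
    for row in csv.reader(lines, delimiter="\t"):
        # Skip blank lines and header/comment lines (assumed to start with '#').
        if not row or row[0].startswith("#"):
            continue
        chrom = row[0]
        start = int(row[1])    # column 2: START (0-based)
        summit = int(row[3])   # column 4: offset of the summit from START
        tf_title = row[5]      # column 6: tfTitle
        anchor = start + summit
        window_start = max(0, anchor - flank)  # do not run off the chromosome start
        window_end = anchor + flank + 1
        yield chrom, anchor, window_start, window_end, tf_title


if __name__ == "__main__":
    # Made-up row for illustration only, not real GTRD data.
    example = ["chr1\t1000\t1200\t80\tP00000\tEXAMPLE_TF"]
    for record in summit_anchors(example):
        print(record)
```

To run it on the downloaded file, the same function can be fed a text handle from `gzip.open(path, "rt")`, where `path` points to the `.interval.gz` file.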

lvran1234 commented 3 years ago

Thank you very much for your reply and patience~