PriceLab / chip-seq-motif-study

to determine how TF motifs do and do not match ChIP-seq assays

test_identifyPeaks.R #10

Open paul-shannon opened 4 years ago

paul-shannon commented 4 years ago

@mariam16548 this sketches out the preliminary contract for your new function. questions welcome.

mariam16548 commented 4 years ago

@paul-shannon I've put up identifyPeaks.R in the ctcf folder. Update: I was able to get the function to pass the tests! I think it wasn't working because of the way I used source() in the beginning (I'm not sure if I did it correctly).

I used: start=53160025, end=56170311

paul-shannon commented 4 years ago

@mariam16548 A few things:

Don't hesitate to call me if this is not crystal clear!

mariam16548 commented 4 years ago

Sounds good!

On Aug 26, 2019, at 8:58 PM, Paul Shannon notifications@github.com wrote:

@mariam16548 A few things:

I checked in some changes to identifyPeaks.R and test_identifyPeaks.R. I cleaned up some code, but had to stop: in test_identifyPeaks(), the bamFile variable (the name of the file) does not refer to the rich, representative, and reasonably small bam file you should have previously sliced out of the big GSM749704_hg19_wgEncodeUwTfbsGm12878CtcfStdAlnRep1.bam and pushed to the repo, where I could pull it and all the tests could use it. Remember that the goal is to create a good test bam file FIRST, which can then be used in all of your tests. Don't hesitate to call me if this is not crystal clear!

mariam16548 commented 4 years ago

@paul-shannon I've made the following changes:

  1. I added the sliced bam file on github: explore/ctcf/sliceGSM749704_hg19_wgEncodeUwTfbsGm12878CtcfStdAlnRep1.bam
  2. I edited the unit test so that it used an appropriate region (start=53160025, end=56170311) and file name.
  3. I edited the identifyPeaks so that it did not include sliceBamFile() within the code.
  4. I reran the tests (and they passed).

mariam16548 commented 4 years ago

@paul-shannon I've written the script you requested. You can find it here: explore/ctcf/simpleScript_identifyPeaks.R

paul-shannon commented 4 years ago

@mariam16548 identifyPeaks.R has a line of code

source("identifyPeaks.R")

I get the error message:

node stack overflow

Do you think these two things may be related?
(General background question, to look into some time: what, in computer science, is a "stack"?)
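
For anyone reading along, here is a minimal sketch of how the two might be related. It assumes the problem is a file that sources itself; the temporary files and the one-line identifyPeaks() body are hypothetical stand-ins, not the real code in the repo. The exact error message depends on the R version and settings, so the sketch only records that the call fails.

```r
# Hypothetical stand-in for identifyPeaks.R: a file whose first line sources itself.
selfSourcing <- tempfile(fileext = ".R")
writeLines(c(
  sprintf("source(%s)", deparse(selfSourcing)),   # the file asks R to read itself again
  "identifyPeaks <- function() 'peaks'"           # never reached
), selfSourcing)

# Each source() call starts another source() before it can finish, so the
# nesting only grows until R gives up with an error.
outcome <- tryCatch({
  source(selfSourcing)
  "finished"
}, error = function(e) conditionMessage(e))
print(outcome)

# One possible fix (an assumption, not necessarily what the repo does):
# keep the definitions in a file that does not source itself, and source
# that file once from the test script.
definitions <- tempfile(fileext = ".R")
writeLines("identifyPeaks <- function() 'peaks'", definitions)
source(definitions)
fixed <- identifyPeaks()
print(fixed)
```

The second half is the usual arrangement: the test file sources the function file, and the function file sources nothing that leads back to itself.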

mariam16548 commented 4 years ago

@paul-shannon Interesting! I will look into that.

paul-shannon commented 4 years ago

@mariam16548 - You now have the ability to find narrow and broad peaks, using MACS2 via docker, called from R. Excellent. Next up, these tasks:

paul-shannon commented 4 years ago

@mariam16548 I asked above, what is a stack?

How about including this in ongoing background study of data structures?

Do a web search on computer science data structures to get started. I suggest that in your notes file, you make a list of recognized data structures, including an explanation and, in time, an example of each in R. Let this be a slow-paced investigation, one that fills in around, but does not displace, your main focus on the ChIP-seq/motif analysis.
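
As a possible first entry for that notes file, here is a minimal sketch of a stack in R. The names newStack, push, pop, and size are made up for illustration; base R has no built-in stack type, so this one is assembled from a list and closures.

```r
# A stack is last-in, first-out: the most recently pushed item is the
# first one popped. This hypothetical helper keeps the items in a list
# that only push() and pop() can modify.
newStack <- function() {
  items <- list()
  push <- function(x) {
    items[[length(items) + 1]] <<- x   # add to the top
    invisible(NULL)
  }
  pop <- function() {
    if (length(items) == 0) stop("stack is empty")
    top <- items[[length(items)]]      # read the top item
    items[[length(items)]] <<- NULL    # then remove it
    top
  }
  size <- function() length(items)
  list(push = push, pop = pop, size = size)
}

s <- newStack()
s$push("test_identifyPeaks.R")
s$push("identifyPeaks.R")
first <- s$pop()    # "identifyPeaks.R": pushed last, popped first
second <- s$pop()   # "test_identifyPeaks.R"
remaining <- s$size()
```

R keeps track of nested function calls in much the same way: every call is pushed when it starts and popped when it returns, which is why calls that never return eventually exhaust the available stack.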

mariam16548 commented 4 years ago

Sounds great!

mariam16548 commented 4 years ago

@paul-shannon

  1. I've put up findBindingSites() onto github (explore/ctcf/findBindingSites.R)
  2. I added findBindingSites() and the rest of the column names to identifyPeaks.R
  3. I added a motif track, renamed the other tracks
  4. I am still working on the notes, I will let you know when they are on github

mariam16548 commented 4 years ago

@paul-shannon So I've added six "interesting spots" to explore/ctcf/igvR_and_identifyPeaks.R. I have also added my notes in github in the "chIP-seq-practice" folder (mariamNotes.txt).

paul-shannon commented 4 years ago

Hi Mariam,

Excellent! Tomorrow (Tuesday) I will restructure the code you have written into an S4 class http://adv-r.had.co.nz/S4.html - which will seem odd at first, but whose beauty you will soon see.

mariam16548 commented 4 years ago

Okay, great!
