UcarLab / BiFET

A robust statistical test for TF footprint data analyses
2 stars 2 forks source link

Error in optim(init, fn = loglik_per_TF, gr = deriv_per_TF, lower = 10^(-16), : L-BFGS-B needs finite values of 'fn' #1

Closed rdbcasillas closed 4 years ago

rdbcasillas commented 4 years ago

Hello,

Thanks for creating BiFET.

After creating the two GRanges objects as specified in the tutorial (only difference being that I used HINT instead of PIQ), I came across this error when running the following command.

> result <- calculate_enrich_p(input1_GR, input2_GR)
[1] "PWM 1 out of 1 PWMs"
Error in optim(init, fn = loglik_per_TF, gr = deriv_per_TF, lower = 10^(-16),  : 
  L-BFGS-B needs finite values of 'fn'

I have verified that my inputs look exactly like the inputs GRmotif and GRpeaks provided with the package. The two inputs can be downloaded from this dropbox link.

Any help is greatly appreciated. Thanks!

nlawlor commented 4 years ago

Hi rdbcasillas,

Thanks for your question and for using BiFET! It looks like the input footprinting calls file that you provided are for only one transcription factor motif "KLF6". Currently, BiFET must be supplied with >1 unique transcription factor motifs/PWMs as the function calculate_enrich_p first calculates a "TF binding matrix M where i_th row represents i_th PWM and j_th column represetns j_th peak".

I hope this helps :) Nathan

rdbcasillas commented 4 years ago

Hi Nathan,

Thanks for responding. I added another TF STAT3 but I am getting the same error. Here is what I get this time:

result <- calculate_enrich_p(input1_GR, input2_GR)
[1] "PWM 1 out of 2 PWMs"
Error in optim(init, fn = loglik_per_TF, gr = deriv_per_TF, lower = 10^(-16),  : 
  L-BFGS-B needs finite values of 'fn'

I have updated the dropbox link with new data.

nlawlor commented 4 years ago

Hmm, that's strange. I'm able to run the calculate_enrich_p function using the package test data with 2 and even 1 TF, so please disregard my previous comments about that.

Looks like this could be an issue with the parameters in the optimization function within calculate_enrich_p similar to what is shown here: Stack Overflow

@youna2 Can you please have a look at the provided data?

nlawlor commented 4 years ago

Hi rdbcasillas,

I revisited your input peak file and noticed that your GC content values seemed strange.

# range of GC content values in peaks
range(in1$GC)
[1] 0.000000 0.213918
# how many peaks have a GC content of 0
length(which(in1$GC == 0))
[1] 55433

I'm not sure what species your peak data originated from, but it seems strange that your max GC content is ~21% (seems low; I think the average GC content in human genomes is usually between 40-60%) and that there are 50k peaks with a GC content of exactly zero. Perhaps there were some issues in calculating GC content in your input peak file?

Either way, this is an issue the developers and I will address so that users do not encounter the same error.

For now, a quick fix can be to remove the peaks in your dataset with a GC content equal to zero:

# load input files
in1 <- readRDS("Downloads/BiFet files/input1.rds")
in2 <- readRDS("Downloads/BiFet files/input2.rds")
# identify and remove peaks in peak dataset with GC content equal to zero
id_high <- which(in1$GC > 0)
in1_sel <- in1[id_high]
res <- calculate_enrich_p(in1_sel, in2)

Please give this a try and let me know if you still get the same error!

rdbcasillas commented 4 years ago

Removing rows with GC content equal to zero worked! Thanks Nathan.

I'm not sure what species your peak data originated from

I am playing with mouse data. The peak file was generated by MACS2. I will have to check with other datasets if this low GC content is an anomaly or not.

nevelsk90 commented 4 years ago

Hi,

Thanks for making the tool,

Unfortunately I get the same error even when GC content is not equal zero. Please download R object with input files using the link : https://drive.google.com/open?id=19HZIcR5CGBwDod-Ht0jSnHBsYynGPhJB

I should say that the file containing instances of motifs (GRmotif) is NOT a result of footprint calling but of motif scanning with FIMO. But if I understand correctly BiFET should be able to peocess these results as well.

Thank you in advance and regards TIm