rr1859 / R.4Cker

MIT License

error when running transAnalysis.R #2

Open dzisis opened 8 years ago

dzisis commented 8 years ago

Hi, I tried to run transAnalysis.R with my own dataset, but it returned the following error message:

[1] "Normalizing counts..."
[1] "Generating synthetic samples...."
Error in fitdistr(int_low[, 1], "normal", lower = 0.001) : 
  'x' must be a non-empty numeric vector

Here is the object that I created for my dataset:

my_obj = createR4CkerObjectFromFiles(files = c("~/Downloads/R.4Cker-master/bedGraph_files/Unique_counts1_5leng100.bedGraph",
                                               "~/Downloads/R.4Cker-master/bedGraph_files/Unique_counts1_6leng100.bedGraph"),
                                     bait_chr="chr5",
                                     bait_coord= 3178620,
                                     bait_name = "FLC",
                                     primary_enz = "AGATCT",
                                     samples = c("Unique_counts1_5leng100","Unique_counts1_6leng100"),
                                     conditions = "VER",
                                     replicates = 2,
                                     species = "at",
                                     output_dir = "~/Downloads/R.4Cker-master/VER_results_R4Cker2/")

When I run the near-bait analysis, everything is fine:

nb_results=nearBaitAnalysis(my_obj,k=10)

Any ideas how to fix this? Is it because I want to use the tool on Arabidopsis thaliana data? Thank you in advance for your help.

Best regards, Dimitrios

dzisis commented 8 years ago

I created a different object like this:

my_obja = createR4CkerObjectFromFiles(files = c("~/Downloads/R.4Cker-master/bedGraph_files/Unique_counts1_3leng100.bedGraph",
                                              "~/Downloads/R.4Cker-master/bedGraph_files/Unique_counts1_4leng100.bedGraph",
                                               "~/Downloads/R.4Cker-master/bedGraph_files/Unique_counts1_5leng100.bedGraph",
                                               "~/Downloads/R.4Cker-master/bedGraph_files/Unique_counts1_6leng100.bedGraph"),
                                     bait_chr="chr5",
                                     bait_coord= 3178620,
                                     bait_name = "FLC",
                                     primary_enz = "AGATCT",
                                     samples = c("1_3", "1_4", "1_5", "1_6"),
                                     conditions = c("A","B"),
                                     replicates = c(2,2),
                                     species = "at",
                                     output_dir = "~/Downloads/R.4Cker-master/VER_results_R4CkerONECONDtest/")

But again, every time I try to run the trans or the cis analysis I get errors like these:

> cis_results=cisAnalysis(my_obja,k=10)
[1] "Building adaptive windows..."
[1] "Normalizing counts..."
[1] "Generating synthetic samples...."
Error in `colnames<-`(`*tmp*`, value = c("counts", "distance")) : 
  'names' attribute [2] must be the same length as the vector [0]
> trans_results=transAnalysis(my_obja,k=20)
[1] "Building adaptive windows..."
[1] "Normalizing counts..."
[1] "Generating synthetic samples...."
Error in fitdistr(int_low[, 1], "normal", lower = 0.001) : 
  'x' must be a non-empty numeric vector

I also tried different objects without multiple conditions, but at best only nearBaitAnalysis works. Could you please give me some ideas about this kind of problem? Is it because of my data? Thank you, Dimitris

rr1859 commented 8 years ago

Hi,

I have not tested the method with Arabidopsis. I would be curious to see a bedGraph file of your data. If you don't mind, can you send me a bedGraph file for one of your samples?

Ramya

dzisis commented 8 years ago

Hi, Thank you for your reply. I will attach here the 2 bedGraph files that I have from my pilot experiment. Because this is a pilot experiment, we don't have replicates. Just rename the txt files to bedGraph :)

Best Dimitris

Unique_counts1_6leng100.txt Unique_counts1_5leng100.txt

rr1859 commented 8 years ago

Thanks for sending me the files. We use the DESeq method of normalization, which requires taking the geometric mean of the rows, and I did not account for scenarios where there are zeros. So I added a +1 to the matrix of counts when calculating the size factor. I ran the two files you sent me through the cis and trans analyses and they seem to be working fine. Please download the package again and give it a try. Let me know if it works!
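
For anyone wondering what the +1 does, here is a minimal sketch in R, assuming a DESeq-style median-of-ratios calculation. The function name `size_factors`, its `pseudocount` argument, and the toy matrix are made up for illustration and are not taken from the R.4Cker source. Dividing the original counts by the size factors at the end is also an assumption.

```r
# Hypothetical helper (not the package's code): median-of-ratios size
# factors with a pseudocount added before taking geometric means.
# `counts` is a windows x samples matrix of raw counts.
size_factors <- function(counts, pseudocount = 1) {
  counts <- as.matrix(counts)
  shifted <- counts + pseudocount
  # Per-row geometric mean, computed on the log scale.
  log_gm <- rowMeans(log(shifted))
  # Per-sample median of the log ratios to the row geometric mean.
  apply(shifted, 2, function(col) exp(median(log(col) - log_gm)))
}

# Toy data: the first window is zero in both samples.
counts <- matrix(c(0, 10, 20,
                   0, 20, 40),
                 ncol = 2,
                 dimnames = list(NULL, c("s1", "s2")))

# Without a pseudocount, a single zero makes the geometric mean zero.
gm_with_zero <- exp(mean(log(c(0, 5))))

sf <- size_factors(counts)
normalized <- sweep(counts, 2, sf, "/")
```

A row containing a zero has a geometric mean of zero, so the ratios for that row are undefined. The pseudocount keeps every row usable when computing the size factors.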

dzisis commented 8 years ago

Thank you for the reply. OK, I downloaded the package again and tried it, and it works for the object with 2 datasets in 2 replicates. Later I tried to run it again with a separate object for each dataset, like this:

my_obja = createR4CkerObjectFromFiles(files = c("~/Downloads/R.4Cker-master/bedGraph_files/estimate_counts1_6leng100.bedGraph"),
                                     bait_chr="chr5",
                                     bait_coord= 3178620,
                                     bait_name = "FLC",
                                     primary_enz = "AGATCT",
                                     samples = c("estimate_counts1_6"),
                                     conditions = "A",
                                     replicates = 1,
                                     species = "at",
                                     output_dir = "~/Downloads/R.4Cker-master/estimate_1_6/")

Both the near-bait and cis analyses work fine, but the trans analysis again gives an error like this:

> trans_results=transAnalysis(my_obja,k=20)
[1] "Building adaptive windows..."
[1] "Normalizing counts..."
Error in apply(x + 1, 1, div_gm) : dim(X) must have a positive length

Best Dimitris
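
A possible explanation for this message, offered as a guess rather than a confirmed diagnosis of the package: in R, selecting a single column from a matrix drops it to a plain vector unless `drop = FALSE` is given, and `apply()` then fails because a vector has no dimensions. The sketch below reproduces the same error text on toy data. The matrix contents are invented, and `sum` stands in for the package's `div_gm`.

```r
# A one-sample count matrix: 4 windows x 1 sample (toy values).
counts <- matrix(c(3, 0, 7, 2), ncol = 1,
                 dimnames = list(NULL, "estimate_counts1_6"))

# Default subsetting drops the matrix to a plain vector with no dim().
dropped <- counts[, 1]
has_no_dim <- is.null(dim(dropped))

# apply() on that vector raises an error; capture its message.
msg <- tryCatch(apply(dropped + 1, 1, sum),
                error = function(e) conditionMessage(e))

# drop = FALSE keeps the matrix shape, so apply() also works for one sample.
kept <- counts[, 1, drop = FALSE]
row_sums <- apply(kept + 1, 1, sum)
```

If the single-sample case goes through a subsetting step like this, keeping the matrix shape with `drop = FALSE` is the usual remedy.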

rr1859 commented 8 years ago

Is this the same bait as the other samples? I would recommend running all samples with the same bait together. We generate synthetic samples by shuffling the windows between samples and use this to train the HMM and then test on your actual samples. When you only have one sample you are essentially training and testing the model on the same dataset which can lead to problems of over-fitting.

In any case, I have added the option to run the program with 1 replicate now even for trans.
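
To illustrate the shuffling idea described above, here is a rough sketch in R under one reading of it: for every window, a synthetic sample takes the count from a randomly chosen real sample. The helper `make_synthetic`, its arguments, and the toy counts are hypothetical and are not the R.4Cker implementation.

```r
# Hypothetical helper (not the package's code): build synthetic samples by
# picking, for each window, the count from a randomly chosen real sample.
# `counts` is a windows x samples matrix.
make_synthetic <- function(counts, n_synthetic = 2, seed = 1) {
  counts <- as.matrix(counts)
  set.seed(seed)  # fixed seed so the sketch is reproducible
  n_windows <- nrow(counts)
  synthetic <- sapply(seq_len(n_synthetic), function(i) {
    # One source sample index per window.
    pick <- sample.int(ncol(counts), n_windows, replace = TRUE)
    counts[cbind(seq_len(n_windows), pick)]
  })
  matrix(synthetic, nrow = n_windows,
         dimnames = list(NULL, paste0("synthetic_", seq_len(n_synthetic))))
}

real <- cbind(a = c(1, 2, 3, 4), b = c(10, 20, 30, 40))
syn_two <- make_synthetic(real, n_synthetic = 3)

# With a single real sample there is nothing to shuffle between, so every
# synthetic sample is an exact copy of the real one.
syn_one <- make_synthetic(real[, "a", drop = FALSE], n_synthetic = 2)
```

The single-sample case shows the over-fitting concern concretely: the training data and the test data are the same values.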

dzisis commented 8 years ago

Hi, Yes, I understood that we need synthetic samples, but I was trying to get normalized results so that I can test them or compare them with results from other methods. Yes, those 2 samples have the same bait. After your changes I managed to run all steps for every kind of analysis. Thank you very much for your help and collaboration. Best, Dimitris