sshen82 / BandNorm

Simple Normalization Method for single-cell Hi-C
GNU General Public License v3.0

scGAD scoring on diploid scHiC data #6

Closed tarak77 closed 2 years ago

tarak77 commented 2 years ago

Hi Siqi and Ye, I currently have ~58 single-cell datasets that are diploid: instead of chr1, chr2, ..., the chromosomes are named chr1(mat), chr1(pat), chr2(mat), chr2(pat), and so on. I came across your scGAD scoring method and wanted to try it on my data. With Ye's help I was able to prepare the input files, but when I run the script it gives some errors.

  1. scHiC data: I decided to use the bin-pair format, so I have a folder containing one 5-column bed file per cell (only nonzero-count bin pairs are included, because of the high resolution of 10 kb). The folder looks something like this: [Screenshot: Screen Shot 2022-06-06 at 1 45 22 PM]

     Each bed file looks something like this: [Screenshot: Screen Shot 2022-06-06 at 1 53 58 PM]

  2. Gene annotation file: this file is the same as the one defined in the tutorial, except that each gene now has two entries, one for the maternal (mat) and one for the paternal (pat) chromosome. It looks something like this: [Screenshot: Screen Shot 2022-06-06 at 1 49 00 PM]

When I run the command gad_score = scGAD(path = "/4cs", genes = geneANNOTATION, depthNorm = TRUE, binPair = TRUE, res = 10000), I get the following error:

Error in if (input == "" || length(grep("\\n|\\r", input))) { : 
missing value where TRUE/FALSE needed

Maybe it's due to the way I am giving the input path? Any help would be great!

sshen82 commented 2 years ago

Hi Tarak, Thanks for using the package! It looks like your binA and binB values have been divided by 10000. scGAD does not need them divided: it accepts binA and binB as genomic coordinates such as 3130000, 3140000, etc. That could be the problem. Let me know if there are still errors!
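For readers with the same issue, here is a minimal sketch of converting bin indices back to base-pair coordinates. The column names (chrA, binA, chrB, binB, count) and the example rows are assumptions for illustration, not taken from the files in this thread, and the check is only a heuristic.

```r
# Sketch only: column names and values below are hypothetical.
res <- 10000

cell <- data.frame(
  chrA  = c("chr1(mat)", "chr1(mat)"),
  binA  = c(313, 314),            # bin indices, i.e. coordinates divided by res
  chrB  = c("chr1(mat)", "chr1(mat)"),
  binB  = c(314, 320),
  count = c(2, 1)
)

# Heuristic: genuine coordinates at this resolution are multiples of res,
# so any value that is not a multiple suggests the column holds bin indices.
looks_like_index <- function(x, res) any(x %% res != 0)

if (looks_like_index(c(cell$binA, cell$binB), res)) {
  cell$binA <- cell$binA * res
  cell$binB <- cell$binB * res
}

print(cell)
```

The heuristic can miss the rare file in which every index happens to be a multiple of the resolution, so it is worth looking at a few rows by eye as well.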

Yours sincerely, Siqi Shen

tarak77 commented 2 years ago

Hi Siqi, it's still giving the same error.

Would sharing some files with you work?

sshen82 commented 2 years ago

I think the error comes from the fread function that reads the data. The line is

"cell = fread(paths[k], select = c(1, 2, 4, 5))"

and "paths" is built from "path" by

"paths = list.files(path, full.names = TRUE, recursive = TRUE)"

Could you please try the list.files function directly and see what the output is? It seems that the "paths" variable is empty.
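The check suggested above can be sketched in base R as follows. The directory name is a made-up placeholder, and list_cells is a hypothetical helper, not part of BandNorm. The sketch shows why an empty "paths" plausibly leads to the reported message: indexing an empty character vector yields NA, and an if () condition on NA fails with exactly that text.

```r
# Sketch only: the directory below is a hypothetical path that should not exist.
paths <- list.files("/no/such/dir_hypothetical", full.names = TRUE, recursive = TRUE)
n_found <- length(paths)   # list.files returns character(0) for a missing directory
first <- paths[1]          # indexing past the end gives NA_character_

# Mimic the kind of check that appears in the error message from the thread.
msg <- tryCatch(
  {
    if (first == "") "empty string" else "looks fine"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Hypothetical guard: fail early with a readable message instead.
list_cells <- function(path) {
  found <- list.files(path, full.names = TRUE, recursive = TRUE)
  if (length(found) == 0) stop("No files found under: ", path)
  found
}
```

Calling a guard like list_cells before the scoring step turns the opaque message into one that names the directory that came back empty.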

If the above still doesn't work, then sharing two cells and the gene annotation file would certainly be helpful.

tarak77 commented 2 years ago

Ah, I see. When I give the full path, it works! Thanks.
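One way to make the full-path fix robust is to resolve the directory before passing it on. This is a sketch under assumptions: the thread does not show the actual full path, so a temporary directory and a one-line example file stand in for the real data, and the scGAD call is left commented out with the arguments from the original post.

```r
# Sketch only: a temporary directory stands in for the real data folder.
base <- file.path(tempdir(), "scgad_demo")      # hypothetical location
cell_dir <- file.path(base, "4cs")
dir.create(cell_dir, recursive = TRUE, showWarnings = FALSE)

# One made-up bin pair so that the folder is not empty.
writeLines("chr1(mat)\t3130000\tchr1(mat)\t3140000\t2",
           file.path(cell_dir, "cell_1.bed"))

# normalizePath with mustWork = TRUE stops immediately if the folder is missing.
full_path <- normalizePath(cell_dir, mustWork = TRUE)
paths <- list.files(full_path, full.names = TRUE, recursive = TRUE)
print(basename(paths))

# With the files confirmed, the call from the original post would become:
# gad_score = scGAD(path = full_path, genes = geneANNOTATION,
#                   depthNorm = TRUE, binPair = TRUE, res = 10000)
```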