SoundAg / sounDMR

Differentially methylated region analysis from Oxford Nanopore Technologies data
Apache License 2.0

Whole_genome analysis problem #68

Closed PraddyumnaR closed 1 month ago

PraddyumnaR commented 2 months ago

Hi, I am facing an issue with whole-genome analysis. These are the entries in my BED file:

Chr1 1115 1116 CHG 10 + 1115 1116 255 10 80.00 CHG
Chr1 1117 1118 CHG 14 - 1117 1118 255 14 79.00 CHG
Chr1 1118 1119 CHG 10 + 1118 1119 255 10 90.00 CHG
Chr1 1120 1121 CHG 12 - 1120 1121 255 12 92.00 CHG
Chr1 1122 1123 CHG 9 + 1122 1123 255 9 89.00 CHG
Chr1 1124 1125 CHG 14 - 1124 1125 255 14 79.00 CHG
Chr1 1127 1128 CHH 10 + 1127 1128 255 10 90.00 CHH
Chr1 1128 1129 CHH 14 - 1128 1129 255 14 86.00 CHH
Chr1 1138 1139 CHH 11 + 1138 1139 255 11 91.00 CHH

What I am confused about is the next step. I successfully ran the command below, and split_by_chromosome produced the output folders:

methyl_bed <- list.files(path=".", pattern="*.bed")
bedlist <- c()
for (i in 1:length(methyl_bed)) {
  beds <- split_by_chromosome(file.path(getwd(), methyl_bed[i]))
  bedlist[(length(bedlist) + 1)] <- list(beds)
}

What should I do after this step? I tried running the following commands, but they did not work. How do I resolve this?

chrs_list <- unique(str_extract(bedlist,"ch0"))
Warning message:
In stri_extract_first_regex(string, pattern, opts_regex = opts(pattern)) :
  argument is not an atomic vector; coercing

chrs_list <- unique(str_extract(bedlist,"Chr0"))
Warning message:
In stri_extract_first_regex(string, pattern, opts_regex = opts(pattern)) :
  argument is not an atomic vector; coercing

chrs_list <- unique(str_extract(bedlist,"Chr*"))
Warning message:
In stri_extract_first_regex(string, pattern, opts_regex = opts(pattern)) :
  argument is not an atomic vector; coercing

chrs_list <- unique(str_extract(bedlist,"Chr"))
Warning message:
In stri_extract_first_regex(string, pattern, opts_regex = opts(pattern)) :
  argument is not an atomic vector; coercing

chr_list <- c(Chr1, Chr2, Chr3, Chr4, Chr5, Chr6, Chr7, Chr8, Chr9, Chr10, Chr11, Chr12)
Error: object 'Chr1' not found

decibel-tom commented 2 months ago

Are you redefining chr_list with the line chr_list <- c(Chr1, Chr2, Chr3, Chr4, Chr5, Chr6, Chr7, Chr8, Chr9, Chr10, Chr11, Chr12)? Written that way, R tries to read Chr1 as a variable, and because no such object is defined it cannot find it. If you are defining the chromosome list manually, I assume you should wrap each name in double quotes, such as chr_list <- c("Chr1", "Chr2").
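The quoting point, and the "argument is not an atomic vector" warning from the earlier attempts, can be illustrated with a small base R sketch. The nested `bedlist` here is a hypothetical stand-in with made-up file names; this thread does not show what split_by_chromosome() actually returns, so treat the structure as an assumption.

```r
# Hypothetical stand-in for the object built by the loop: a list with one
# character vector of per-chromosome file names per input BED file.
# (Assumption: the real return value of split_by_chromosome() is not shown here.)
bedlist <- list(
  c("sampleA_Chr1.bed", "sampleA_Chr2.bed", "sampleA_Chr10.bed"),
  c("sampleB_Chr1.bed", "sampleB_Chr2.bed", "sampleB_Chr10.bed")
)

# A list is not an atomic vector, which is what the stringr warning is about.
# Flattening it first gives the pattern matcher a plain character vector.
paths <- unlist(bedlist)

# Pull out "Chr" followed by one or more digits from each path.
chrs_list <- unique(regmatches(paths, regexpr("Chr[0-9]+", paths)))
print(chrs_list)

# Defining the list by hand works too, as long as the names are quoted strings.
chr_list_manual <- c("Chr1", "Chr2", "Chr10")
```

The pattern "Chr[0-9]+" is chosen to match the Chr1 ... Chr12 naming used in this thread; a different reference naming scheme would need a different pattern.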

PraddyumnaR commented 2 months ago

chrs_list <- c("Chr1", "Chr2", "Chr3", "Chr4", "Chr5", "Chr6", "Chr7", "Chr8", "Chr9", "Chr10", "Chr11", "Chr12")
methyl_bed <- bedlist[grep(chrs_list[1],bedlist)]
Methylframe <- generate_methylframe(methyl_bed_list=methyl_bed, Sample_count = 0, Methyl_call_type="Dorado", filter_NAs = 0, max_read_depth=1000, gene_info = FALSE, gene_coordinate_file = NA, Gene_column=NA, target_info=FALSE, File_prefix="Sample")

I tried this command, but it gives an error during methylframe generation:

Error in read.table(file = file, header = header, sep = sep, quote = quote, : 'file' must be a character string or connection

From the README it is difficult to know how to proceed with whole-genome analysis. Please reply soon and, if possible, update the README.
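Separately from the error above, one thing worth checking in the grep() selection step is that the pattern "Chr1" also matches "Chr10", "Chr11" and "Chr12". The sketch below uses made-up file paths (an assumption, since the actual output names are not shown in this thread) to show the difference between a loose and a strict match.

```r
# Hypothetical per-chromosome file paths; the real names may differ.
paths <- c(
  "out/Sample1_Chr1.bed",
  "out/Sample1_Chr10.bed",
  "out/Sample2_Chr1.bed",
  "out/Sample2_Chr2.bed"
)

# Loose match: "Chr1" is also a prefix of "Chr10", so three files come back.
loose <- grep("Chr1", paths, value = TRUE)

# Strict match: a negative lookahead rejects a digit right after "Chr1".
# Lookaheads need perl = TRUE in base R.
strict <- grep("Chr1(?![0-9])", paths, value = TRUE, perl = TRUE)

print(loose)
print(strict)
```

Whether this affects a given run depends on how the files are named and on how many chromosomes the reference has, so it is a check rather than a diagnosis.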

decibel-tom commented 2 months ago

If you pass NULL to the gene_coordinate_file argument, does that work? Alternatively, what happens if you omit that argument altogether? I can update the README to reflect that NA does not work here.
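For context, the error text in this report is the one base R's read.table() raises whenever its `file` argument is neither a character string nor a connection. The sketch below reproduces it with a plain NA and with a list. It says nothing about how generate_methylframe() handles its arguments internally, and `msg_for` is a helper invented for this illustration.

```r
# Helper (hypothetical, for illustration only): call read.table() on x and
# return the error message instead of stopping.
msg_for <- function(x) {
  tryCatch(
    {
      read.table(file = x)
      "no error"
    },
    error = function(e) conditionMessage(e)
  )
}

# A bare NA is a logical value, not a character string.
na_msg <- msg_for(NA)

# A list of file names is not a character string either.
list_msg <- msg_for(list("a.bed"))

print(na_msg)
print(list_msg)

# NULL is easy to test for with is.null(), which is one reason it is a
# common "no file supplied" default in R functions.
is.null(NULL)
```

Both calls fail before any file is opened, so no file named a.bed needs to exist for the sketch to run.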

PraddyumnaR commented 2 months ago

Okay, I will try that and let you know. When I skipped the split_by_chromosome function and manually supplied data for a single chromosome, it worked with NA and I was able to run the whole analysis, so I think I should proceed with the same approach.

sachingadakh commented 2 months ago

Hello PraddyumnaR, I read that you were able to run the whole analysis by providing one chromosome manually. I faced the same error as you and tried the same approach, but I cannot get any further with creating the Megaframe, such as Methyl_bed, and I hit the error described in #69. Can you help me resolve it? It is very urgent for me. I am also waiting for a response from the tool's authors. Thank you.

PraddyumnaR commented 2 months ago

Hey Sachin, I saw the issue and that you received a reply from the authors too. I would suggest following the demo data input format exactly, since any change in the input can trigger an error. Also, if there are any duplicate entries for the same chromosome and position, make sure to remove them.

PraddyumnaR commented 2 months ago

Hey, can you post the workflow commands to follow for whole-genome analysis? It is difficult to tell from the README which commands to execute and in what order. Some issues mention updated commands, but those changes have not been made in the README.

sachingadakh commented 2 months ago

Hello PraddyumnaR, thank you for the heads-up. I removed the duplicated rows, which were tagged with "h", and replaced the dot (.) with + in the strand column, and the rest worked perfectly. I am now analyzing the results and hope everything goes fine. Thank you again.
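The cleanup described above could look roughly like the base R sketch below. The column names (`chrom`, `start`, `end`, `code`, `strand`) and the toy rows are assumptions made for illustration and are not taken from the tool or from any real file. Note that rewriting "." as "+" assigns a strand to records that had none, so it is worth confirming that this suits the data before applying it.

```r
# Toy table with hypothetical column names; real bedMethyl files have more columns.
bed <- data.frame(
  chrom  = c("Chr1", "Chr1", "Chr1", "Chr1"),
  start  = c(1115L, 1115L, 1117L, 1118L),
  end    = c(1116L, 1116L, 1118L, 1119L),
  code   = c("m", "h", "m", "m"),   # the "h" row repeats position 1115
  strand = c("+", "+", ".", "-"),
  stringsAsFactors = FALSE
)

# 1. Drop the rows tagged "h".
bed <- bed[bed$code != "h", ]

# 2. Remove any remaining duplicates of the same chromosome and position.
bed <- bed[!duplicated(bed[, c("chrom", "start")]), ]

# 3. Replace an unknown strand "." with "+".
bed$strand[bed$strand == "."] <- "+"

print(bed)
```

Writing the cleaned table back out with write.table(..., sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE) would keep it in a headerless tab-separated layout.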