ShixiangWang / DoAbsolute

:package: Automate Absolute Copy Number Calling using 'ABSOLUTE' package

Assistance Requested for Interpreting ABSOLUTE Package Results #32

Closed PollyHung closed 7 months ago

PollyHung commented 7 months ago

Dear Dr. Shixiang Wang,

I hope this message finds you well. I would like to express my gratitude for your efforts in keeping the ABSOLUTE package available, which has been indispensable for my research. I have been using the package and recently ran the RunAbsolute function with the following code:

x <- read.csv(paste0("~/result_df/", i, ".csv"))  
names(x) <- c("Chromosome","Start","End","Num_Probes","Segment_Mean")
write.table(x, "x", sep = "\t", row.names = FALSE)

## run absolute 
RunAbsolute("x", 
              sigma.p=0, 
              max.sigma.h=0.015, min.ploidy=0.95, max.ploidy=10, 
              primary.disease="ov", platform="Illumina_WES", 
              sample.name= i, 
              results.dir="~/MRes_project_1/docs/HH_ova/absolute_output", 
              max.as.seg.count=1500, max.non.clonal=0.05, max.neg.genome=0.005, 
              copy_num_type="total", 
              maf.fn=NULL, min.mut.af=NULL, 
              output.fn.base=NULL, verbose=FALSE)

I was able to obtain the output in the .RData format as well as the corresponding plot, as shown below:

Screenshot 2024-01-15 at 20 01 55

ABSOLUTE_plot.pdf

Unfortunately, I find myself at an impasse due to the limited documentation available, and I am having difficulty interpreting the results, specifically in extracting the purity and ploidy scores.

Would it be possible for you to provide any guidance or insights on how to derive or calculate the purity and ploidy scores from these results? Your expertise would be of great assistance and is sincerely appreciated.

Thank you very much for your time and help.

Best Regards, Polly Hung

ShixiangWang commented 7 months ago

Hi @PollyHung

You have run only one step of the pipeline that generates the final results. I recommend simply using DoAbsolute, following the README.

https://github.com/ShixiangWang/DoAbsolute/blob/ae20c68792702ff3f144f774b3031b0ab6825b38/R/DoAbsolute.R#L418-L441

For documentation, please see https://www.genepattern.org/analyzing-absolute-data#gsc.tab=0 (some other links maintained by the authors of ABSOLUTE seem to be down).
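For readers who want to stay with the plain ABSOLUTE functions, the steps that follow RunAbsolute can be sketched roughly as below. This is a minimal sketch, not the DoAbsolute implementation: it assumes the ABSOLUTE package is installed, that RunAbsolute has already written `<sample>.ABSOLUTE.RData` files into the results directory, and that the review step writes a `PP-calls_tab.txt` file containing purity and ploidy columns. The helper name `extract_purity_ploidy`, the `review` subdirectory, and the default `obj_name` and `analyst_id` values are hypothetical.

```r
# Sketch: summarise RunAbsolute outputs and read back the purity/ploidy calls.
# Assumes RunAbsolute() has already been run for every sample in `results_dir`.
extract_purity_ploidy <- function(results_dir, obj_name = "my_cohort",
                                  analyst_id = "analyst",
                                  copy_num_type = "total") {
  # Per-sample files written by RunAbsolute()
  absolute_files <- list.files(results_dir, pattern = "\\.ABSOLUTE\\.RData$",
                               full.names = TRUE)
  if (length(absolute_files) == 0) {
    stop("No *.ABSOLUTE.RData files found in ", results_dir)
  }
  if (!requireNamespace("ABSOLUTE", quietly = TRUE)) {
    stop("The ABSOLUTE package is required for this step")
  }

  review_dir <- file.path(results_dir, "review")  # hypothetical output location

  # Collect all per-sample results into one review object
  ABSOLUTE::CreateReviewObject(
    obj.name = obj_name, absolute.files = absolute_files,
    indv.results.dir = review_dir, copy_num_type = copy_num_type,
    plot.modes = TRUE, verbose = TRUE
  )

  # Files written by the review step; located by pattern, not by exact name
  calls_fn <- list.files(review_dir, pattern = "PP-calls_tab\\.txt$",
                         full.names = TRUE)
  modes_fn <- list.files(review_dir, pattern = "PP-modes\\.data\\.RData$",
                         full.names = TRUE)

  # Extract the called solution for each sample
  ABSOLUTE::ExtractReviewedResults(
    reviewed.pkl.fn = calls_fn, analyst.id = analyst_id, modes.fn = modes_fn,
    out.dir.base = review_dir, obj.name = obj_name,
    copy_num_type = copy_num_type, verbose = TRUE
  )

  # Tab-delimited calls table; purity and ploidy are expected among its columns
  read.delim(calls_fn, stringsAsFactors = FALSE)
}
```

Calling `extract_purity_ploidy("~/MRes_project_1/docs/HH_ova/absolute_output")` would then return one row per sample, if the assumptions above hold. DoAbsolute wraps these steps, which is why it is the simpler route.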

PollyHung commented 7 months ago

Dear Dr. Shixiang Wang,

I greatly appreciate your prompt and insightful response! I have indeed generated the purity and ploidy scores for my samples. I also agree that DoAbsolute would be the better option for our purposes. Regrettably, because I currently use FACETS, I can produce only segmentation files and cannot generate the requisite MAF files.

Is there a feasible way to use DoAbsolute without a MAF file? Any guidance or alternative suggestions you could provide would be immensely valuable.

Thank you once again for your valuable time and assistance.

Best Regards, Polly Hung

ShixiangWang commented 7 months ago

Hi, the MAF file is optional: https://github.com/ShixiangWang/DoAbsolute/blob/ae20c68792702ff3f144f774b3031b0ab6825b38/R/DoAbsolute.R#L21

But do note that providing a MAF as input is recommended.
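Without a MAF, the segmentation table is the only data input. A minimal sketch of building one combined table from per-sample CSV files, assuming each file holds the five columns used earlier in this thread (Chromosome, Start, End, Num_Probes, Segment_Mean) in that order and that the file name is the sample name. The helper name `build_seg_table` and the demo data are hypothetical.

```r
# Sketch: combine per-sample segment CSVs into one table with a Sample column.
build_seg_table <- function(csv_files) {
  seg_cols <- c("Chromosome", "Start", "End", "Num_Probes", "Segment_Mean")
  per_sample <- lapply(csv_files, function(f) {
    x <- read.csv(f, stringsAsFactors = FALSE)
    stopifnot(ncol(x) == length(seg_cols))
    names(x) <- seg_cols
    # Sample name is taken from the file name, e.g. "sample_1.csv" -> "sample_1"
    data.frame(Sample = tools::file_path_sans_ext(basename(f)), x,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, per_sample)
}

# Small self-contained demonstration with made-up segments
demo_dir <- tempfile("seg_demo")
dir.create(demo_dir)
write.csv(data.frame(chrom = c(1, 1), start = c(1, 5001), end = c(5000, 9000),
                     n = c(40, 25), mean = c(0.12, -0.40)),
          file.path(demo_dir, "sample_1.csv"), row.names = FALSE)
write.csv(data.frame(chrom = 2, start = 1, end = 8000, n = 60, mean = 0.05),
          file.path(demo_dir, "sample_2.csv"), row.names = FALSE)

seg <- build_seg_table(list.files(demo_dir, pattern = "\\.csv$",
                                  full.names = TRUE))
print(seg)

# With real data, the table could then be passed on without a MAF, e.g.
# DoAbsolute(Seg = seg, platform = "Illumina_WES", copy.num.type = "total",
#            results.dir = "absolute_output", verbose = TRUE)
```

Keeping all samples in one table with an explicit Sample column avoids looping over RunAbsolute by hand.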

PollyHung commented 7 months ago

Dear Dr. Shixiang Wang,

I hope this message finds you well.

Thank you for your prompt response to my previous query! I am writing to seek further assistance with the DoAbsolute.R function. Despite following your valuable suggestions, I am encountering a recurring error when processing my samples. The following is an excerpt from the error log for your reference:

--> Processing sample sample_3 ...
[1] "Capping 0 segs at tCR = 5.0"
[1] "Expected copy-ratio = 0.91591"
[1] "Mode flag is NA, not generating plots. Sample has failed ABSOLUTE"
--> Processing sample sample_4 ...
... and so on for each sample ...

This issue occurs for every sample except the first two, which were processed successfully. I have consulted various discussions, including those on Biostars, where it was recommended to set the copy_num_type parameter to "total". Despite applying this advice, the problem remains unresolved.

I would be immensely grateful for any insights you might offer into the potential causes of this issue and any advice on how to rectify it. Understanding the root of this problem is crucial for the advancement of my work, and I highly value your expertise on the matter.

Thank you for considering my request. I am looking forward to your guidance.

Best Wishes, Polly
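When some samples succeed and others fail, comparing simple per-sample summaries of the input segments can help narrow things down before sharing data. A minimal sketch in base R, assuming a combined segmentation table with Sample, Chromosome, Num_Probes, and Segment_Mean columns; the helper name `summarise_segments` and the demo values are hypothetical, and nothing here identifies the actual cause of the failure reported above.

```r
# Sketch: per-sample summaries of a segmentation table, for comparing
# samples that ABSOLUTE processed successfully with samples that failed.
summarise_segments <- function(seg) {
  rows <- lapply(split(seg, seg$Sample), function(s) {
    data.frame(
      Sample        = s$Sample[1],
      n_segments    = nrow(s),
      n_chromosomes = length(unique(s$Chromosome)),
      min_seg_mean  = min(s$Segment_Mean, na.rm = TRUE),
      max_seg_mean  = max(s$Segment_Mean, na.rm = TRUE),
      n_incomplete  = sum(!complete.cases(s)),  # rows with any missing value
      stringsAsFactors = FALSE
    )
  })
  out <- do.call(rbind, rows)
  rownames(out) <- NULL
  out
}

# Made-up example: sample_3 has a missing probe count
seg_demo <- data.frame(
  Sample       = c("sample_1", "sample_1", "sample_3"),
  Chromosome   = c("1", "2", "1"),
  Start        = c(1, 1, 1),
  End          = c(5000, 8000, 9000),
  Num_Probes   = c(40, 60, NA),
  Segment_Mean = c(0.12, -0.40, 0.05),
  stringsAsFactors = FALSE
)
seg_summary <- summarise_segments(seg_demo)
print(seg_summary)
```

Large differences between working and failing samples in segment count, value range, or missing entries would be a reasonable first thing to report.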

ShixiangWang commented 7 months ago

Could you send the data for sample_1, sample_2, and sample_3 to my email, w_shixiang@163.com? I will take a look.

ShixiangWang commented 7 months ago

It works with the following code:

library(DoAbsolute)
library(data.table)
# Load Test Data ----------------------------------------------------------

# segmentation file
files = list.files("~/../Downloads/", pattern = "seg", full.names = TRUE)
files
Seg = rbindlist(lapply(files, fread))

Seg
Seg2 = Seg[, list(Sample = ID, Chromosome = chrom, Start = loc.start, End = loc.end, Num_Probes = num.mark, Segment_Mean = seg.mean)]

# test function
DoAbsolute(Seg = Seg2, platform = "Illumina_WES", copy.num.type = "total",
           results.dir = "~/../Downloads/test", keepAllResult = TRUE, verbose = TRUE)

I will send a copy to your email.