aeeckhou / shallowHRD

This method uses shallow Whole Genome Sequencing (sWGS) and the segmentation of a genomic profile to assess the Homologous Recombination Deficiency of a tumor based on the number of Large-scale Genomic Alterations (LGAs).
30 stars 13 forks source link

full command to run shallowHRD using ichorCNA output file #2

Closed asaki1986 closed 3 years ago

asaki1986 commented 3 years ago

Hi,

I am using shallowHRD to process the output data from ichorCNA.

By running shallowHRD_hg19_ichorCNA.R, it still required the file name ending with .bam_ratio.txt.

I am wondering which file should I provide, WGS1221-L3.100k.seg or WGS1221-L3.100k.seg.txt, since both of them failed to pass the command even I changed the name to WGS1221-L3.100k.seg.bam_ratio.txt

Since shallowHRD had already provided the ichorCNA option, I am wondering which is the right command to get the HRD score.

Thanks, Junfeng

aeeckhou commented 3 years ago

Hello,

There is a little trick to run ichorCNA! You have to adapt the output to the format required by shallowHRD and therefore use two different files of ichorCNA's output : the XXX.correctedDepth.txt and the XXX.seg file (where XXX is the name of your file).

Below you will find a simple script in R to unite them. Then it should work!

readDepth = read.table("XXX.correctedDepth.txt", sep = "\t", header = TRUE)
readDepth = readDepth[which(is.na(readDepth$log2_TNratio_corrected) != TRUE),]

seg = read.table("XXX.seg", sep = "\t", header = TRUE)

readDepth["ratio_median"] <- NA

L = dim(readDepth)[1]
l = dim(seg)[1]

readDepth$chr = as.numeric(readDepth$chr)
readDepth$start = as.numeric(readDepth$start)
readDepth$end = as.numeric(readDepth$end)

seg$chr = as.numeric(seg$chr) 
seg$start = as.numeric(seg$start)
seg$end = as.numeric(seg$end)

head(readDepth)
head(seg)

i=1
j=1

while (i < L+1){
  while (readDepth$chr[i] == seg$chr[j] && readDepth$start[i] >= seg$start[j] && readDepth$end[i] <= seg$end[j]){
    readDepth$ratio_median[i] = seg$median[j]
    i = i +1
  }
  j = j + 1
}

readDepth = readDepth[,-3]
colnames(readDepth) <- c("chromosome", "start", "ratio", "ratio_median")

write.table(readDepth, file = "XXX.bam_ratio.txt", sep = "\t")

Tell me if you manage to make it work.

Best, Alexandre Eeckhoutte

asaki1986 commented 3 years ago

Hello,

There is a little trick to run ichorCNA! You have to adapt the output to the format required by shallowHRD and therefore use two different files of ichorCNA's output : the XXX.correctedDepth.txt and the XXX.seg file (where XXX is the name of your file).

Below you will find a simple script in R to unite them. Then it should work!

readDepth = read.table("XXX.correctedDepth.txt", sep = "\t", header = TRUE)
readDepth = readDepth[which(is.na(readDepth$log2_TNratio_corrected) != TRUE),]

seg = read.table("XXX.seg", sep = "\t", header = TRUE)

readDepth["ratio_median"] <- NA

L = dim(readDepth)[1]
l = dim(seg)[1]

readDepth$chr = as.numeric(readDepth$chr)
readDepth$start = as.numeric(readDepth$start)
readDepth$end = as.numeric(readDepth$end)

seg$chr = as.numeric(seg$chr) 
seg$start = as.numeric(seg$start)
seg$end = as.numeric(seg$end)

head(readDepth)
head(seg)

i=1
j=1

while (i < L+1){
  while (readDepth$chr[i] == seg$chr[j] && readDepth$start[i] >= seg$start[j] && readDepth$end[i] <= seg$end[j]){
    readDepth$ratio_median[i] = seg$median[j]
    i = i +1
  }
  j = j + 1
}

readDepth = readDepth[,-3]
colnames(readDepth) <- c("chromosome", "start", "ratio", "ratio_median")

write.table(readDepth, file = "XXX.bam_ratio.txt", sep = "\t")

Tell me if you manage to make it work.

Best, Alexandre Eeckhoutte

Great, Works!

I changed the script a little to avoid the error.

while (i < L+1)
{
    while (i < L + 1 && readDepth$chr[i] == seg$chr[j] && readDepth$start[i] >= seg$start[j] && readDepth$end[i] <= seg$end[j])
    {
        readDepth$ratio_median[i] = seg$median[j]
        i = i +1
    }
    j = j + 1
}

write.table(readDepth, file = args[3], sep = "\t", quote=F, row.names=F)
aeeckhou commented 3 years ago

Glad it worked!

Good luck on you research!!

Best, Alexandre Eeckhoutte

glscaglione commented 3 years ago

Hi Alexandre,

I ran into the same error using QDNAseq output files(.seg and .txt). I would like to ask you how to adapt the output to use shallowHRD. Thank you in advance.

Best, Gianluca

aeeckhou commented 3 years ago

Hello Gianluca,

You need to adapt the output of QDNAseq to shallowHRD. What I'm doing :

## end of QDNAseq script that I'm using personnaly

copyNumbersSegmented <- segmentBins(copyNumbersSmooth, transformFun = 'none',
                                    alpha = 0.05, nperm = 10000, p.method = "hybrid",
                                    min.width=5, kmax=25, nmin=200, eta=0.05,
                                    trim = 0.025, undo.splits = "sdundo", undo.SD=1)
copyNumbersSegmented <- normalizeSegmentedBins(copyNumbersSegmented)

## save the results of QDNAseq

exportBins(copyNumbersSegmented, file=paste0(outputPath,"/",NAMEEE,".bam_CN"), format="tsv", type=c("copynumber")) 
exportBins(copyNumbersSegmented, file=paste0(outputPath,"/",NAMEEE,".bam_seg"), format="tsv", type=c("segments"))

## Adapt the outputs to a correct format

X = read.table(paste0(outputPath,"/",NAMEEE,".bam_CN"), sep = "\t", header = TRUE)
Y = read.table(paste0(outputPath,"/",NAMEEE,".bam_seg"), sep = "\t", header = TRUE)

X = cbind(X, Y[,5])
X = X[,-4]
X = X[,-1]
head(X)

colnames(X) <- c("chromosome", "start", "ratio", "ratio_median")
write.table(X, file = paste0(outputPath,"/",NAMEEE,".bam_ratio.txt"), sep = "\t", row.names = FALSE)

file.remove(paste0(outputPath,"/",NAMEEE,".bam_CN"))
file.remove(paste0(outputPath,"/",NAMEEE,".bam_seg")) 

The point is to match this format: Chromosome Start Ratio RatioMedian 1 1 -1 -1 1 20001 -1 -1 . . . . . . . .

You can then use the shallowHRD_hg19_QDNAseq.R script provided in the repository and it should work!

Best, Alexandre

glscaglione commented 3 years ago

Fab!

It works perfectly. Thanks a lot Gianluca

Il giorno gio 25 mar 2021 alle ore 09:33 Alexandre Eeckhoutte < @.***> ha scritto:

Hello Gianluca,

You need to adapt the output of QDNAseq to shallowHRD. What I'm doing :

end of QDNAseq script that I'm using personnaly

copyNumbersSegmented <- segmentBins(copyNumbersSmooth, transformFun = 'none', alpha = 0.05, nperm = 10000, p.method = "hybrid", min.width=5, kmax=25, nmin=200, eta=0.05, trim = 0.025, undo.splits = "sdundo", undo.SD=1) copyNumbersSegmented <- normalizeSegmentedBins(copyNumbersSegmented)

save the results of QDNAseq

exportBins(copyNumbersSegmented, file=paste0(outputPath,"/",NAMEEE,".bam_CN"), format="tsv", type=c("copynumber")) exportBins(copyNumbersSegmented, file=paste0(outputPath,"/",NAMEEE,".bam_seg"), format="tsv", type=c("segments"))

Adapt the outputs to a correct format

X = read.table(paste0(outputPath,"/",NAMEEE,".bam_CN"), sep = "\t", header = TRUE) Y = read.table(paste0(outputPath,"/",NAMEEE,".bam_seg"), sep = "\t", header = TRUE)

X = cbind(X, Y[,5]) X = X[,-4] X = X[,-1] head(X)

colnames(X) <- c("chromosome", "start", "ratio", "ratio_median") write.table(X, file = paste0(outputPath,"/",NAMEEE,".bam_ratio.txt"), sep = "\t", row.names = FALSE)

file.remove(paste0(outputPath,"/",NAMEEE,".bam_CN")) file.remove(paste0(outputPath,"/",NAMEEE,".bam_seg"))

And then you can use the shallowHRD_hg19_QDNAseq.R script provided repository and it should work!

Best, Alexandre

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/aeeckhou/shallowHRD/issues/2#issuecomment-806465143, or unsubscribe https://github.com/notifications/unsubscribe-auth/AIHNEU5UVCU6OARQK5Z7BK3TFLYOJANCNFSM4YF66PEQ .