Closed disulfidebond closed 2 years ago
Hi there,
Thanks for trying out our tools. In principle, it should be quite easy to build your copy number file so that it can be used in the hrDetect script. The command line script is just a wrapper for the R function `HRDetect_pipeline`, and according to our documentation, the text file for CNV should be a TAB-separated file with a header in the first line containing the following columns: 'seg_no', 'Chromosome', 'chromStart', 'chromEnd', 'total.copy.number.inNormal', 'minor.copy.number.inNormal', 'total.copy.number.inTumour', 'minor.copy.number.inTumour'.
The only reason I can think of for it not to work is that you might have formatted the CNV file as comma-separated values (CSV), which is the typical ASCAT output. That would not work, as TAB separation is required.
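A quick pre-flight check of the header can catch both problems before running the pipeline. The snippet below is only a sketch: `check_cnv_header` is a hypothetical helper, not part of HRDetect, and the column list is copied from the reply above.

```python
# Column names as listed in the reply above (note the UK spelling "Tumour").
REQUIRED_COLUMNS = [
    "seg_no", "Chromosome", "chromStart", "chromEnd",
    "total.copy.number.inNormal", "minor.copy.number.inNormal",
    "total.copy.number.inTumour", "minor.copy.number.inTumour",
]


def check_cnv_header(text):
    """Return a list of problems found in the first line of a CNV table.

    Hypothetical helper for illustration; it only inspects the header line.
    """
    first_line = text.splitlines()[0] if text else ""
    problems = []
    # A header with commas but no tabs is most likely a CSV export.
    if "\t" not in first_line and "," in first_line:
        problems.append("header looks comma-separated; a TAB-separated file is required")
    columns = first_line.split("\t")
    missing = [name for name in REQUIRED_COLUMNS if name not in columns]
    if missing:
        problems.append("missing columns: " + ", ".join(missing))
    return problems


good_table = "\t".join(REQUIRED_COLUMNS) + "\n1\t1\t0\t1000\t2\t1\t3\t1\n"
csv_table = ",".join(REQUIRED_COLUMNS) + "\n1,1,0,1000,2,1,3,1\n"
print(check_cnv_header(good_table))
print(check_cnv_header(csv_table))
```

An empty list means the header has the expected names and separator; it says nothing about the values in the rows.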
Affymetrix data from an array is not necessary. All you need is the data formatted as above, which indicates where the segments start and end, as well as the tumour and normal copy numbers of each segment. The algorithm will just read the segments and count the LOH segments of a certain size to compute the HRD-LOH index.
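To make the idea of counting LOH segments concrete, here is a rough sketch. It is not the HRDetect implementation: `count_loh_segments` is a hypothetical function, the LOH rule (tumour minor copy number 0 with total above 0) is an assumption, and the size threshold in the example is purely illustrative.

```python
def count_loh_segments(segments, min_size):
    """Count tumour LOH segments that are at least min_size bases long.

    Assumption: a segment is treated as LOH when the tumour minor copy
    number is 0 while the tumour total copy number is above 0.
    """
    count = 0
    for seg in segments:
        length = seg["chromEnd"] - seg["chromStart"]
        is_loh = (
            seg["minor.copy.number.inTumour"] == 0
            and seg["total.copy.number.inTumour"] > 0
        )
        if is_loh and length >= min_size:
            count += 1
    return count


example_segments = [
    # Long LOH segment: counted.
    {"chromStart": 0, "chromEnd": 20_000_000,
     "total.copy.number.inTumour": 2, "minor.copy.number.inTumour": 0},
    # LOH but too short for the chosen threshold: not counted.
    {"chromStart": 20_000_000, "chromEnd": 21_000_000,
     "total.copy.number.inTumour": 1, "minor.copy.number.inTumour": 0},
    # Long but heterozygous: not counted.
    {"chromStart": 21_000_000, "chromEnd": 60_000_000,
     "total.copy.number.inTumour": 3, "minor.copy.number.inTumour": 1},
]
loh_count = count_loh_segments(example_segments, min_size=15_000_000)
print(loh_count)
```

The sketch only needs segment coordinates and tumour copy numbers, which is why array data is not a requirement.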
Let me know if you still have problems with this. BW, Andrea
Hi Andrea,
Thanks for the reply. I set up breakpoints in the code and found out what was causing the error. The US and UK spellings differ (US: tumor, UK: tumour), and when the code attempted to read the CNV text file with these column header names:
seg_no Chromosome chromStart chromEnd total.copy.number.inNormal minor.copy.number.inNormal total.copy.number.inTumor minor.copy.number.inTumor
it threw the error:
Error in { :
task 2 failed - "arguments imply differing number of rows: 154, 1, 0"
Calls: HRDetect_pipeline -> %dopar% -> <Anonymous>
After completing the `dopar` loop in the code for `HRDetect.R`, the value for the HRD score was null (formatting applied by me):
completed foreach loop
finished foreach read.table code block
finished foreach read.table code block, result hrd_list is
[[1]] NULL
finished if code block to compute HRD-LOH for samples, data_matrix is

| | del.mh.prop | SNV3 | SV3 | SV5 | hrd | SNV8 |
|---|---|---|---|---|---|---|
| TESTSAMPLE1 | 0.09756098 | 43.13171 | 0 | 0 | NA | 160.8142 |
When I changed the spelling of the headers for the ASCAT-format input CNV file to `total.copy.number.inTumour` and `minor.copy.number.inTumour`, the error disappeared and it worked correctly.
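For anyone hitting the same thing, the header fix can be scripted. This is a minimal sketch, assuming the only difference is the `inTumor` suffix on column names; `normalise_header` is a hypothetical helper name, not something HRDetect provides.

```python
def normalise_header(header_line):
    """Rewrite US-spelled '...inTumor' column names to the UK '...inTumour'.

    Hypothetical helper; columns that already use the UK spelling are untouched.
    """
    columns = header_line.rstrip("\r\n").split("\t")
    fixed = []
    for name in columns:
        if name.endswith("inTumor"):
            # Drop the trailing "Tumor" and append the UK spelling.
            name = name[: -len("Tumor")] + "Tumour"
        fixed.append(name)
    return "\t".join(fixed)


us_header = (
    "seg_no\tChromosome\tchromStart\tchromEnd\t"
    "total.copy.number.inNormal\tminor.copy.number.inNormal\t"
    "total.copy.number.inTumor\tminor.copy.number.inTumor\n"
)
uk_header = normalise_header(us_header)
print(uk_header)
```

Only the first line of the file needs this treatment; the data rows stay as they are.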
Best, John
Nice, glad it worked. Andrea
Sorry to comment on a closed issue. I have some related questions, so I thought it would be nice to have them in one thread. I am able to get ASCAT 3.0 to work, but the output is not quite in the format that HRDetect asks for. I saw that the `$segments` output contains `sample chr startpos endpos nMajor nMinor`. I cannot find the copy number in normal samples anywhere. Should I just assume that the copy number in the normal samples is 2 (total) and 1 (minor)?
In general yes, for the purpose of HRDetect you can set 2 total and 1 minor and it should work.
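A conversion along those lines could look like the sketch below. It assumes the ASCAT segment columns named in the question (`chr`, `startpos`, `endpos`, `nMajor`, `nMinor`), takes the tumour total as `nMajor + nMinor` and the tumour minor as `nMinor`, and hard-codes the normal values to 2 and 1 as suggested. `ascat_segments_to_hrdetect` is a hypothetical name, and the segment numbering here simply runs from 1 in input order.

```python
HRDETECT_COLUMNS = [
    "seg_no", "Chromosome", "chromStart", "chromEnd",
    "total.copy.number.inNormal", "minor.copy.number.inNormal",
    "total.copy.number.inTumour", "minor.copy.number.inTumour",
]


def ascat_segments_to_hrdetect(rows):
    """Build a TAB-separated table from ASCAT-style segment rows.

    rows: list of dicts with keys chr, startpos, endpos, nMajor, nMinor.
    Assumption: normal copy numbers are fixed at 2 (total) and 1 (minor).
    """
    lines = ["\t".join(HRDETECT_COLUMNS)]
    for seg_no, row in enumerate(rows, start=1):
        n_major = int(row["nMajor"])
        n_minor = int(row["nMinor"])
        fields = [
            seg_no, row["chr"], row["startpos"], row["endpos"],
            2, 1,                      # assumed normal total and minor
            n_major + n_minor, n_minor,
        ]
        lines.append("\t".join(str(field) for field in fields))
    return "\n".join(lines) + "\n"


ascat_rows = [
    {"sample": "S1", "chr": "1", "startpos": 1000, "endpos": 5000, "nMajor": 2, "nMinor": 0},
    {"sample": "S1", "chr": "1", "startpos": 5001, "endpos": 9000, "nMajor": 1, "nMinor": 1},
]
table = ascat_segments_to_hrdetect(ascat_rows)
print(table)
```

The `sample` column is dropped because the required header has no place for it, so a multi-sample segment table would need to be split into one file per sample first.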
Hello, I'm exploring tools for HRD detection and signature analysis for our lab. I tested out HRDetect using `hrDetect.R` via the command line, and I was able to format data from our lab to match the required formatting for SNV, Indels, and SV. The required text tables and VCF files for HRDetect were created from VCF output files from DRAGEN and Manta.

When I attempted to modify CNV output from DRAGEN to match the ASCAT format for HRDetect, it consistently failed with the error `task 1 failed - arguments imply differing number of rows`. Verifying that the CNV text file column headers were correct and that the rows had inferred tumor and normal copy number values for the corresponding genomic locations resulted in the same error.

When I substituted the provided `test_hrdetect_1.cna.txt` and `test_hrdetect_2.cna.txt` example files as the CNV input for HRDetect along with the other modified lab data, the error vanished and the HRDetect workflow completed without errors. Can we use output from a tool like Sequenza and adapt it to the ASCAT format that HRDetect requires? Or is the only option to use Affymetrix data from an array?

Thanks!