Closed Katherine-Kelly closed 8 months ago
Thank you for the message. Currently epiAneufinder takes only BAM or fragment files as input, but we can add a function that accepts a count matrix. Can you please tell us what kind of count matrix you have (bins or peaks)?
Thanks, that would be very helpful - I would be working with a peak counts matrix.
Hi! Any updates on this?
Hi, we have created a new function that takes a count matrix as input, and we are currently testing its performance, since the input matrix is even sparser. We plan to release it next week. I will keep you posted in this thread.
Hi all, the dev branch now has a new implementation that also accepts a count matrix as input. The input should be in 10X format (3 files: barcodes.tsv, peaks.bed and matrix.mtx, with exactly this naming for now). We compared CNV calling with the peak matrix as input against CNV calling with the fragment file as input. We used the SNU601 dataset, as in the publication, with the scWGS dataset as ground truth. The correlation with the ground-truth CNVs is lower when using the matrix as input (0.74, down from 0.85 with the fragment file). We lose reads that are not in peak regions, so overall we get lower coverage across the genome. As a result, we can miss some gains and losses that would otherwise have been identified with the fragment input. We would therefore suggest some caution when interpreting the results, but as the comparison with the ground truth shows, the method still gives valid results. Try it out and please give us any feedback that you have. We will push it to the main branch in a couple of days, once we finalize the vignette.
Best, Katia and Katharina
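For anyone preparing a peak matrix for this input mode, a quick consistency check on the three files can catch naming or dimension problems before running the tool. The following is only a minimal sketch and is not part of epiAneufinder: the helper name `check_10x_peak_matrix` is hypothetical, and it assumes the usual 10X layout (Matrix Market coordinate format, rows = peaks, columns = barcodes, no header lines in barcodes.tsv or peaks.bed).

```python
from pathlib import Path


def check_10x_peak_matrix(directory):
    """Hypothetical helper: sanity-check a 10X-style peak matrix directory.

    Returns (n_peaks, n_barcodes, n_entries) if the three files exist and
    their dimensions agree; raises an exception otherwise.
    """
    d = Path(directory)
    # File names as stated in the thread above.
    paths = {name: d / name for name in ("barcodes.tsv", "peaks.bed", "matrix.mtx")}
    missing = [name for name, p in paths.items() if not p.is_file()]
    if missing:
        raise FileNotFoundError("missing files: " + ", ".join(missing))

    def count_rows(path):
        # Count non-empty lines; assumes the file has no header line.
        with open(path) as fh:
            return sum(1 for line in fh if line.strip())

    n_barcodes = count_rows(paths["barcodes.tsv"])
    n_peaks = count_rows(paths["peaks.bed"])

    # In Matrix Market files, lines starting with '%' are header/comments;
    # the first other line is the size line: "<rows> <cols> <nonzero entries>".
    with open(paths["matrix.mtx"]) as fh:
        for line in fh:
            if line.strip() and not line.startswith("%"):
                n_rows, n_cols, n_entries = (int(x) for x in line.split())
                break
        else:
            raise ValueError("matrix.mtx has no size line")

    # Assumption: rows are peaks and columns are barcodes.
    if (n_rows, n_cols) != (n_peaks, n_barcodes):
        raise ValueError(
            f"matrix.mtx is {n_rows} x {n_cols}, but peaks.bed has {n_peaks} "
            f"rows and barcodes.tsv has {n_barcodes} rows"
        )
    return n_peaks, n_barcodes, n_entries
```

Calling `check_10x_peak_matrix("path/to/matrix_dir")` on a consistent directory returns the three counts; a mismatch usually means the matrix is transposed or one of the files has a header line.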
Nice, I tested and it works. Thanks!
The code has been merged to the main branch. I am closing the issue, since everything seems to be in order.
I am using some public data where fragments files are not available - is it possible to run epiAneufinder using only a counts matrix or is the fragments file required?