I'm encountering an error while trying to run the getPeak2Gene function to analyze ATAC-seq and RNA-seq data. The error says that "Duplicate 'row.names' are not allowed." It seems like the issue arises from duplicate row names (gene symbols) in the RNA matrix after reading the data. I would appreciate any suggestions for fixing this issue or handling duplicate row names in a better way.
Environment:
Error Message:
2024-10-10 15:11:27 Remove the gene with all expression value is 0.
`.rowNamesDF<-`(x, value = value) error: 'row.names' must be numeric
Additional warning message:
non-unique values when setting 'row.names': '0610010B08Rik', '0610010F05Rik', '0610010K14Rik', '1-Mar', '1-Sep', '10-Mar', '10-Sep', ...
What I Have Tried:
I tried using unique() to remove duplicate row names, but that causes gene information to be lost.
I also considered using make.unique() to add unique identifiers to the row names, but I prefer not to alter the gene names in this way as it may impact the downstream analysis.
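To make the two attempts concrete, here is a minimal sketch of what each one does to a vector of symbols. The gene names are toy values chosen for illustration, not rows from my real matrix.

```r
# Toy gene symbols (made up for illustration), with two duplicated entries
genes <- c("Actb", "Gapdh", "Actb", "1-Mar", "1-Mar")

# unique() drops the repeated symbols, so the result is shorter than the
# matrix and the dropped rows lose their labels
kept <- unique(genes)

# make.unique() keeps every row but appends a numeric suffix to the repeats
renamed <- make.unique(genes)

print(kept)
print(renamed)
```

With unique() the vector no longer matches the number of rows, and with make.unique() the repeats get names such as "Actb.1", which is the renaming I would like to avoid.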
What I Want to Solve:
How can I resolve the duplicate row names issue so that getPeak2Gene works correctly?
What is the best practice for handling duplicate gene symbols in RNA matrices?
Are there other recommended ways to ensure that the row names (gene symbols) in my matrix are unique without losing critical information?
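One option I am aware of is to collapse rows that share a symbol into a single row, for example by summing their values with base R's rowsum(). The sketch below uses a made-up table and assumes the first column holds the gene symbols; whether summing TPM values is appropriate probably depends on why the symbols are duplicated, so this is only an illustration and not necessarily what getPeak2Gene expects.

```r
# Toy TPM table (values made up); the first column is assumed to hold gene symbols
rna <- data.frame(
  gene = c("Actb", "Gapdh", "Actb", "1-Mar", "1-Mar"),
  s1 = c(10, 5, 2, 1, 3),
  s2 = c(20, 6, 4, 0, 1),
  stringsAsFactors = FALSE
)

# rowsum() adds up the rows that share a symbol;
# reorder = FALSE keeps the symbols in first-seen order
collapsed <- rowsum(as.matrix(rna[, -1]), group = rna$gene, reorder = FALSE)

print(collapsed)
```

The result has one row per symbol with the original names unchanged, so it could be written back to a .tsv and passed to the function in place of the original file.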
Additional Information:
The RNA matrix is correctly read into R, but the issue arises because of duplicate row names (gene symbols).
The files I'm working with are .tsv files, and the RNA-seq data contains TPM normalized expression values.
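For anyone who wants to reproduce the check, this is a minimal sketch of listing the duplicated symbols by reading a .tsv without row names. The file here is a temporary toy file with made-up values, and the column name "gene" is an assumption about the layout. Entries such as '1-Mar' and '1-Sep' in the warning would be consistent with symbols having been converted to dates by a spreadsheet program, which may be one source of the duplicates.

```r
# Write a toy .tsv (made-up values) so the example is self-contained
tsv <- tempfile(fileext = ".tsv")
writeLines(c("gene\ts1\ts2", "Actb\t10\t20", "Gapdh\t5\t6", "Actb\t2\t4"), tsv)

# row.names = NULL keeps the symbols as an ordinary column,
# so duplicated symbols do not raise an error at read time
rna <- read.delim(tsv, row.names = NULL, check.names = FALSE,
                  stringsAsFactors = FALSE)

# Symbols that occur more than once
dup_genes <- unique(rna$gene[duplicated(rna$gene)])
print(dup_genes)
```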
Code Example:
Here's the code I'm running:

p2g_res <- getPeak2Gene(
  atac_matrix = "./ATAC_CPM_Norm_Data.tsv",
  rna_matrix = "./RNA_TPM_Norm_Data.tsv",
  peak_annotation = anno,
  max_distance = 50000,
  N_permutation = 10000,
  save_path = "./cisDynet_result"
)