cancerit / C-SAR

Data will be added ahead of poster/conference
GNU Affero General Public License v3.0

Handling count matrices from cell lines starting with a number #16

Open dlalexander opened 1 year ago

dlalexander commented 1 year ago

C-SAR halts for cell lines whose names start with a number. The failure occurs at the read_sample_count_matrix.R stage of RCRISPR: the function read_count_matrix_file() prepends an 'X' to these cell line names, so they no longer match the sample annotation. This gives the following error:

Command error:
  Reading library annotation...
  Reading sample metadata...
  Reading count matrix...
  Comparing count matrix to library...
  Reordering count matrix...
  Error in reorder_count_matrix_by_sample_type(sample_count_matrix, sample_metadata_object) : 
    Cannot reorder count matrix, sample name not in metadata: X921.D42A
  Execution halted

The command executed is:

  Rscript /opt/wsi-t113/c-sar/submodules/rcrispr/exec/read_sample_count_matrix.R \
    -c sample_counts.txt \
    -l singles_library.combined.filt.txt \
    -i sample_manifest.txt \
    --outdir "." \
    --count_matrix_outfile "count_matrix.tsv" \
    --library_outfile "library.processed.tsv" \
    --rdata "counts2matrix.Rdata" \
    --counts_delim " " \
    --count_id_column_index 1 \
    --count_gene_column_index 2 \
    --count_count_column_index 3,4,5,6,7,8,9 \
    --library_delim " " \
    --library_id_column_index 1 \
    --library_gene_column_index 3 \
    --info_delim " " \
    --info_filename_column_index 1 \
    --info_label_column_index 2 \
    --info_plasmid_column_index 3 \
    --info_control_column_index 4 \
    --info_treatment_column_index 5

Test data is attached: sample_manifest.txt, sample_counts.txt, singles_library.combined.filt.txt
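
The 'X' prefix reported here matches what R's make.names() does to column headers when a table is read with check.names = TRUE. Whether read_count_matrix_file() triggers it in exactly that way is an assumption, not something confirmed in this thread. The sketch below is a rough pre-flight check in Python that approximates that renaming and lists manifest labels that would no longer match. The function names and sample labels are hypothetical, chosen to echo the error message, and only the ASCII character and leading-character rules are modelled (reserved words and duplicate handling are ignored).

```python
import re


def r_style_name(name: str) -> str:
    """Approximate R's make.names() for one column header.

    Assumption: only two rules are modelled. Characters other than
    ASCII letters, digits, '.' and '_' become '.', and a name that does
    not start with a letter (or a dot not followed by a digit) gets an
    'X' prefix.
    """
    cleaned = re.sub(r"[^A-Za-z0-9._]", ".", name)
    if not re.match(r"^([A-Za-z]|\.(?![0-9]))", cleaned):
        cleaned = "X" + cleaned
    return cleaned


def mangled_labels(labels):
    """Return {original: renamed} for labels that R-style checking would alter."""
    return {
        label: r_style_name(label)
        for label in labels
        if r_style_name(label) != label
    }


# Hypothetical labels as they might appear in a sample manifest.
manifest_labels = ["HT29.c1", "921.D42A"]

# Column headers as they would look after R-style name checking.
header_after_read = [r_style_name(label) for label in manifest_labels]

# Headers that can no longer be found in the manifest.
missing = [h for h in header_after_read if h not in manifest_labels]

print(mangled_labels(manifest_labels))
print(missing)
```

Running a check like this on the label column of the manifest before starting the pipeline would flag any sample names at risk of the mismatch reported above.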

vaofford commented 1 year ago

RCRISPR release 1.2.2.0 allows numeric sample names in count and fold change matrices: https://github.com/cancerit/RCRISPR/releases/tag/1.2.2.0

vaofford commented 1 year ago

C-SAR version 1.3.7 contains RCRISPR 1.2.2.0