DailyDreaming / load-project


Trim whitespace from csv value when creating mtx #129

Closed dsotirho-ucsc closed 4 years ago

dsotirho-ucsc commented 4 years ago

Whitespace in csv field values is copied verbatim into the space-separated mtx file. The extra spaces are then read as additional delimiters, so the mtx lines are parsed incorrectly.

E.g., projects/GSE81383/geo/GSE81383_data_melanoma_scRNAseq_BT_2015-07-02.txt.gz has lines like:

"AAAS"  " 56.038500"    " 46.539700"    "  0.000000"    " 61.549100"    "  3.808820"    "  0.000000" ...

This causes projects/GSE81383/matrices/GSE81383_data_melanoma_scRNAseq_BT_2015-07-02.txt.gz/matrix.mtx.gz to contain lines like:

%%MatrixMarket matrix coordinate integer general
19019 91 694605
1 1 12.574300
1 2 28.673200
1 3 24.646500
1 4 13.433300
1 5  2.926540
1 7  4.468000
1 8 10.349800

As a result, csv parsing of the mtx file yields 3 columns on some lines and 4 columns on others (where the 3rd element is empty).
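A minimal sketch of the problem and the proposed trim, not the project's actual code: `mtx_line` and `column_counts` are hypothetical helpers, and the padded sample values are assumed to resemble what ends up in the mtx lines above. It reads space-separated lines with Python's `csv` module and compares field counts with and without stripping.

```python
import csv


def mtx_line(row: int, col: int, raw_value: str) -> str:
    """Format one coordinate entry, trimming whitespace from the csv value.

    Hypothetical helper; the real writer in load-project may look different.
    """
    return f"{row} {col} {raw_value.strip()}"


def column_counts(lines):
    """Parse space-separated lines with csv and return each line's field count."""
    reader = csv.reader(lines, delimiter=" ")
    return [len(fields) for fields in reader]


# Assumed sample values: the second one carries a leading padding space.
raw_values = ["12.574300", " 2.926540"]

# Without trimming, the padding space becomes an extra (empty) field.
untrimmed = [f"1 {col} {value}" for col, value in enumerate(raw_values, start=1)]

# With trimming, every line has exactly three fields.
trimmed = [mtx_line(1, col, value) for col, value in enumerate(raw_values, start=1)]

print(column_counts(untrimmed))
print(column_counts(trimmed))
```

Stripping at the point where the value is written keeps the mtx file itself clean, so downstream readers do not each need to tolerate repeated delimiters.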