Closed knaegle closed 1 year ago
@alekhyaa2 I fixed the Accession:GeneName in the existing code. However, we also want to drop the brackets and the explicit string quotes in this field. That is, we want it to read like this:
SH3:4:80;Rho-GAP:109:295;SH2 1:330:425;SH2 2:622:716
instead of this:
['SH3:4:80', 'Rho-GAP:109:295', 'SH2 1:330:425', 'SH2 2:622:716']
To fix one of these things, we should move back to ;-separated fields to avoid the need for string grouping in a CSV (i.e., let's not use commas to separate items inside the fields of a single column).
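A minimal sketch of the conversion described above, assuming the field currently holds either a Python list or the string form of one. The helper names `format_domains` and `parse_domains` are hypothetical and are not taken from the existing code.

```python
import ast


def format_domains(value):
    """Return the domain entries as one ';'-separated string.

    Hypothetical helper: accepts a list of 'name:start:stop' strings,
    or the string representation of such a list.
    """
    if isinstance(value, str):
        text = value.strip()
        if not text.startswith("["):
            return text  # already a flat ';'-separated string
        # Safely turn "['SH3:4:80', ...]" back into a real list.
        value = ast.literal_eval(text)
    return ";".join(str(item).strip() for item in value)


def parse_domains(field):
    """Split a ';'-separated field into (name, start, stop) tuples."""
    domains = []
    for entry in field.split(";"):
        # Split from the right so the domain name itself stays intact.
        name, start, stop = entry.rsplit(":", 2)
        domains.append((name, int(start), int(stop)))
    return domains


old_field = "['SH3:4:80', 'Rho-GAP:109:295', 'SH2 1:330:425', 'SH2 2:622:716']"
new_field = format_domains(old_field)
print(new_field)
```

Because the resulting field contains no commas or quotes, a CSV writer does not need to wrap it in quotation marks, and later parsing only has to split on ; and :.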
Is your feature request related to a problem? Please describe.
Right now the domain reference (CSV) file repeats the accession and gene name as part of the boundaries field. This makes it hard to read and increases the number of fields that need to be parsed later.
Describe the solution you'd like
I suggest we remove these from the boundaries field.
Tasks
Include specific tasks in the order they need to be done. Include links to the specific lines of code where each task should happen.