AllenInstitute / MOp_taxonomies_ontology

Central location for versioning and sharing of taxonomy files relevant for ontology development as part of cell type cards.

Updated CCN201912131 with BDS version #1

Closed. shawntanzk closed this 3 years ago

shawntanzk commented 3 years ago

DO NOT MERGE YET

@jeremymiller @raymond-sanchez I've created a branch and replaced the mouse taxonomy with our version; could you check whether the changes look alright? (The only intended change is the X-like implementation.)

Notes:
1) X-like is implemented in the new versions.
2) The csv file shows a huge diff; need to investigate.
3) In the json file, the version in main uses cell_set_additional_alias instead of cell_set_additional_aliases; need to check whether this affects our pipeline and, if so, check the human and marmoset taxonomies too.
4) The csv file in main does not have a cell type card column.

jeremymiller commented 3 years ago

These changes should all be fine. You are correct that the cell type card column is only in the folder files. We can change cell_set_additional_alias to cell_set_additional_aliases (or vice versa) in any of the input files if desired--no need to edit your pipeline.
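For reference, a minimal sketch of how the key could be renamed across a JSON file in one pass. The nesting of the real taxonomy JSON is not shown in this thread, so the sample record and its example_id field are hypothetical; only the two key names come from the discussion above.

```python
import json

OLD_KEY = "cell_set_additional_alias"
NEW_KEY = "cell_set_additional_aliases"


def rename_key(node, old=OLD_KEY, new=NEW_KEY):
    """Return a copy of `node` with every dict key `old` renamed to `new`."""
    if isinstance(node, dict):
        return {(new if key == old else key): rename_key(value, old, new)
                for key, value in node.items()}
    if isinstance(node, list):
        return [rename_key(item, old, new) for item in node]
    return node  # strings, numbers, booleans, None are returned unchanged


# Hypothetical fragment; the real taxonomy file layout may differ.
sample = json.loads(
    '[{"example_id": "EXAMPLE_1", "cell_set_additional_alias": "example alias"}]'
)
normalized = rename_key(sample)
print(json.dumps(normalized, indent=2))
```

Walking the whole structure recursively means the rename does not depend on where in the file the key appears.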

shawntanzk commented 3 years ago

@jeremymiller we currently use cell_set_additional_aliases, so let's use that :) Thanks! Will merge this branch then.

dosumis commented 3 years ago

Large diffs are due to csv save settings. Presume you did this in Excel? If using comma delimiters, it is better to double-quote cells in case their content has commas (IIRC some columns do have this). Otherwise use a TAB delimiter (internal tabs in cell content are very rare and generally a very bad idea). A TAB delimiter will also allow for unescaped " in cells (uncommon in my experience, but sometimes needed).
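To illustrate the two options, here is a small sketch using Python's standard csv module (not something used in this PR). The rows are hypothetical; one cell contains a comma and another contains a double quote.

```python
import csv
import io

# Hypothetical rows: one cell has a comma, another has a double quote.
rows = [
    ["name", "alias"],
    ["type A", "a1, a2"],
    ["type B", 'so-called "b1"'],
]


def dump(table, **writer_options):
    """Serialize rows to text with the given csv.writer options."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n", **writer_options).writerows(table)
    return buffer.getvalue()


def load(text, **reader_options):
    """Parse text back into rows with the given csv.reader options."""
    return list(csv.reader(io.StringIO(text), **reader_options))


# Option 1: comma delimiter with every cell double-quoted.
csv_text = dump(rows, quoting=csv.QUOTE_ALL)

# Option 2: TAB delimiter with quoting switched off, so " stays unescaped.
tsv_options = {"delimiter": "\t", "quoting": csv.QUOTE_NONE, "quotechar": None}
tsv_text = dump(rows, **tsv_options)

print(csv_text)
print(tsv_text)
```

Both outputs parse back to the same rows; the TSV variant only works as long as no cell contains a tab or a newline.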

Also not a great idea to merge in content changes when you can't view a simple diff. Format changes should be their own commit.
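When a format change and a content change have already landed in the same commit, one way to recover the real changes is to compare parsed cell values instead of raw lines. The following is only a sketch with Python's standard library; the two tables are hypothetical, and it assumes rows and columns are in the same order in both files.

```python
import csv
import io
from itertools import zip_longest


def parse_table(text, delimiter=","):
    """Parse delimited text into a list of rows (lists of cell strings)."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


def changed_cells(old_rows, new_rows):
    """Return (row, column, old, new) for each cell whose parsed value differs."""
    changes = []
    for r, (old_row, new_row) in enumerate(
            zip_longest(old_rows, new_rows, fillvalue=[])):
        for c, (old, new) in enumerate(zip_longest(old_row, new_row)):
            if old != new:
                changes.append((r, c, old, new))
    return changes


# Hypothetical data: same table saved as quoted CSV, then as TSV with one edit.
old_text = 'name,alias\n"type A","a1, a2"\n"type B","b1"\n'
new_text = 'name\talias\ntype A\ta1, a2\ntype B\tb1-like\n'

changes = changed_cells(parse_table(old_text),
                        parse_table(new_text, delimiter="\t"))
print(changes)
```

A line-based diff would flag every row here, while the cell-level comparison reports only the one edited value.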

shawntanzk commented 3 years ago

Yeah, I did use Excel; I did not know that made a difference and will be more careful with that in the future. Should we change this to tsv then? Not sure what's the best way to do it.

dosumis commented 3 years ago

When you export csv from Excel, you can select the delimiters and cell quoting.

dosumis commented 3 years ago

Alternatively, you could use Python Pandas as a converter. That can consume Excel too.
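A minimal sketch of the Pandas route, assuming pandas is installed. The table content is hypothetical and is read from an in-memory string so the example is self-contained; for an actual workbook, pd.read_excel plays the same role as pd.read_csv here (it needs an Excel engine such as openpyxl).

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a table exported from Excel.
csv_text = 'name,alias\n"type A","a1, a2"\n"type B","b1"\n'

# dtype=str and keep_default_na=False keep every cell as literal text,
# so nothing is silently converted to a number or NaN.
table = pd.read_csv(io.StringIO(csv_text), dtype=str, keep_default_na=False)

# Write the same table back out with TAB as the delimiter.
tsv_text = table.to_csv(sep="\t", index=False)
print(tsv_text)
```

Passing a file path to to_csv instead of leaving it empty writes the result to disk.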

dosumis commented 3 years ago

One more (lazy) possibility: copying and pasting from Excel straight into the GitHub editor interface => TAB separation.