Closed by ariutta 5 years ago
@ianwdunlop, would this change be able to be integrated into your code OK?
Sounds reasonable, but this is not the full patch, correct? I mean, we need to update the code accordingly too...
Yes, definitely. I just made those updates. Each of the following, and any references to them, is now updated:
datasources.txt, datasources_headers.txt, organisms.txt, DataSourceTxt.java, datasourcesTxt, IdentifiersOrgDataSource.txt, DataSourceTxtTest, generatedDatasources.txt
The one exception is DataSourceTxt in /UPGRADE_NOTES.md. I'm not sure whether that should simply be changed to DataSourceTsv, whether it should be updated to say BioDataSource.init(); or whether DataSourceTxt.init should be updated to DataSourceTsv.init();
Also, I'd like to see this tested, and I'm not sure right now is the best time... maybe after the 2.3 release? Would that be early enough?
Sure, there's no rush.
I tried to be conservative, so I just changed DataSourceTxt to DataSourceTsv (adjusting each term to match the source capitalization). But it might make more sense to use BioDataSource or DataSourcesMetadata so we aren't tied to a specific file format. I've seen one or both of those terms in the codebase already.
Outdated. We're migrating to .tsv.
Take a look at this version of our datasources file. It's formatted as a readable, searchable table! GitHub and tools like Tad recognize the filename extension .tsv but not .txt. This change will make it much easier to work with this file, both for using it in other programs and for maintaining it. For example, take a look at our current organisms file. It appears we have tabs between some but not all of the species names.
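To illustrate the kind of inconsistency mentioned for the organisms file, here is a minimal sketch of a check that flags lines whose tab-separated field count differs from the first non-empty line. The class name TsvColumnCheck, the method inconsistentLines, and the sample rows are all hypothetical and are not taken from the repository; the real file's columns may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the existing codebase.
public class TsvColumnCheck {

    /**
     * Returns the 1-based numbers of lines whose tab-separated field count
     * differs from that of the first non-empty line. Empty lines are skipped.
     */
    static List<Integer> inconsistentLines(List<String> lines) {
        List<Integer> bad = new ArrayList<>();
        int expected = -1; // field count of the first non-empty line
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            if (line.isEmpty()) {
                continue;
            }
            // Limit -1 keeps trailing empty fields, so "a\t" counts as 2 fields.
            int fields = line.split("\t", -1).length;
            if (expected < 0) {
                expected = fields;
            } else if (fields != expected) {
                bad.add(i + 1);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Made-up sample rows: the second uses a space where a tab is expected.
        List<String> sample = Arrays.asList(
                "Homo sapiens\tHs",
                "Mus musculus Mm",
                "Danio rerio\tDr");
        System.out.println(inconsistentLines(sample)); // prints [2]
    }
}
```

A check along these lines could run in a unit test once the files are renamed to .tsv, so that a missing tab is caught before it reaches a release.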