joncison closed 7 years ago
They don't all match the new pattern because many collections contain a space. The ones that are currently in bio.tools and don't match:
http://cbs.dtu.dk/services g:Profiler toolkit g:Profiler Rostlab tools Czech Republic Masaryk University SHOW - Structured HOmogeneities Watcher MOdels for Data Analysis and Learning - MODAL Bologna Biocomputing Group MoD Tools EBI Tools ChEBI Tools ChEMBL Tools UniProt Tools Ensembl Tools EMBOSS at EBI Tools PDBe Tools EBI Tools (ENA Tools) Thornton Tools Europe PMC Tools Parkinson Tools Goldman Tools Plant Systems Biology BIG N2N Tel Aviv University BioMedBridges Tools Instruct CCP4 Cell Line Integrated Molecular Authentication database and Identification tool USMI Cell Line Database and Analisys Tools USMI Biological Resources Catalogues Bromberglab tools RostLab tools, PredictProtein Odonoghuelab tools http://galaxyapi.web.pasteur.fr EMBOSS_6.3.1 hmmer_3.0 mview1.49 CBS phylip_3.67 pdb-lib_1.0 blastTaxoAnalysis_1.0 njplot_20051109 ViennaRNA_1.8.4 newick-utils_1.6 Clustal-Omega_1.1.0 blast_2.2.26 ClustalW_2.0.12 taxoptimizer_1.1 squizz_0.99b UiO tools KMUTT tools CBU tools UiB tools BiB tools Debian Med NTNU tools Regulatory Sequence Analysis Tools (RSAT) Segway Suite Institut Pasteur Bioinformatics and Biostatistics Hub GEM Pasteur Bioinformatics and Biostatistics Hub Pasteur Structural Mass Spectrometry and Proteomics EBI Training Tools GO Tools ELIXIR Trainer Tools Rare Disease http://www.pubmedcentral.gov/ Medizinisches Proteom-Center LCC NCBR Animal and Crop Genomics micro-computed tomography
Hmm, so we'd need two elements, similar to how we handle name/ID currently
We can revert / refactor biotoolsSchema accordingly - we don't want to be introducing "_" into the collection tags, when rendered
The old pattern for collection IDs was [A-Za-z0-9_\- _ ~]+
The new pattern is [_a-zA-Z][_\-.0-9a-zA-Z]*
So the following should match the new pattern, although they are highly disputable as collections. Except CBS, they look like either tools or toolkits (see also my mention of toolkits further down): EMBOSS_6.3.1 hmmer_3.0 mview1.49 CBS phylip_3.67 pdb-lib_1.0 blastTaxoAnalysis_1.0 njplot_20051109 ViennaRNA_1.8.4 newick-utils_1.6 Clustal-Omega_1.1.0 blast_2.2.26 ClustalW_2.0.12 taxoptimizer_1.1 squizz_0.99b
@hansioan, if you wouldn't mind, could you please re-generate your list once more with the actual new pattern?
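For reference, a minimal sketch of how such a list could be regenerated, assuming the new pattern is exactly the one quoted above and that the whole ID must match it. The helper name `non_matching` and the sample list are illustrative only; this is not the script actually used for bio.tools.

```python
import re

# New pattern as quoted in this thread; fullmatch() requires the whole ID to match.
NEW_PATTERN = re.compile(r"[_a-zA-Z][_\-.0-9a-zA-Z]*")


def non_matching(collection_ids):
    """Return the IDs that do not satisfy the new pattern, in input order."""
    return [cid for cid in collection_ids if NEW_PATTERN.fullmatch(cid) is None]


# A few IDs copied from the lists in this thread (hypothetical sample, not the full registry).
sample = [
    "EMBOSS_6.3.1",
    "hmmer_3.0",
    "CBS",
    "EBI Tools",
    "g:Profiler",
    "http://cbs.dtu.dk/services",
]

print(non_matching(sample))
```

With this sample, the IDs containing a space, a colon or a slash are reported, while the version-tagged names such as EMBOSS_6.3.1 pass.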
These are total nonsense as either collectionID or collectionName, and need to be replaced with what they actually point to or stand for. Luckily only 3 of those exist:
http://www.pubmedcentral.gov/ http://galaxyapi.web.pasteur.fr http://cbs.dtu.dk/services
Some options for handling the space (as I mentioned also in https://github.com/bio-tools/biotoolsSchema/issues/94):
collectionName attribute, as mentioned right above by @joncison. (N.B. that here it isn't so trivial to keep the collectionID-collectionName pairs consistent without going in the direction of the following (option 3.))
We need to tackle developments in stages. The end-game is definitely 3. above, to manifest as "Contributor Cards" and/or "Collection Cards"; this is in the roadmap for next year (https://biotools.sifterapp.com/issues/432). Until then, the simplest of all options is to just revert to accepting any tag for collectionID, but with the above duly noted.
On a related note, it would be very nice to support the relation element in bio.tools, which would then immediately allow us to relate, e.g.
I'll discuss this with @ekry and @hansioan tomorrow.
@matuskalas @joncison The regenerated list of collectionIDs that don't match the new pattern is: http://cbs.dtu.dk/services g:Profiler toolkit g:Profiler Rostlab tools Czech Republic Masaryk University SHOW - Structured HOmogeneities Watcher MOdels for Data Analysis and Learning - MODAL Bologna Biocomputing Group MoD Tools EBI Tools ChEBI Tools ChEMBL Tools UniProt Tools Ensembl Tools EMBOSS at EBI Tools PDBe Tools EBI Tools (ENA Tools) Thornton Tools Europe PMC Tools Parkinson Tools Goldman Tools Plant Systems Biology BIG N2N Tel Aviv University BioMedBridges Tools Instruct CCP4 Cell Line Integrated Molecular Authentication database and Identification tool USMI Cell Line Database and Analisys Tools USMI Biological Resources Catalogues Bromberglab tools RostLab tools, PredictProtein Odonoghuelab tools http://galaxyapi.web.pasteur.fr UiO tools KMUTT tools CBU tools UiB tools BiB tools Debian Med NTNU tools Regulatory Sequence Analysis Tools (RSAT) Segway Suite Institut Pasteur Bioinformatics and Biostatistics Hub GEM Pasteur Bioinformatics and Biostatistics Hub Pasteur Structural Mass Spectrometry and Proteomics EBI Training Tools GO Tools ELIXIR Trainer Tools Rare Disease http://www.pubmedcentral.gov/ Medizinisches Proteom-Center LCC NCBR Animal and Crop Genomics micro-computed tomography
For now, we revert to simple tags (see https://github.com/bio-tools/biotoolsSchema/issues/79)
Same deal as in https://github.com/bio-tools/biotoolsregistry/issues/284: there's a change in the regex to support (the eventual) use of collection IDs in semantic web applications.
Old pattern: [A-Za-z0-9-~.]+
New pattern: [_a-zA-Z][_\-.0-9a-zA-Z]*
Again, collection tags must now start with an underscore or a letter.
@hansioan : will you pls. check ASAP that our existing collection tags satisfy the new pattern - and confirm here?
cc @matuskalas