Open seltmann opened 2 years ago
@seltmann suggested prepending the exact location of the row from which the name was extracted, e.g.:
line:zip:hash://sha256/05b2680a060f4bccce78b001243b5b6a579fd7c67db10978d14bfac486d38ed5!/occurrence.txt!/L9 | Anolis festae HAS_ACCEPTED_NAME COL:675NP Anolis festae species Biota | Animalia | Chordata | Reptilia | Squamata | Iguania | Dactyloidae | Anolis | Anolis festae COL:5T6MX | COL:N | COL:CH2 | COL:RP | COL:45C | COL:87BW7 | COL:8Y8 | COL:WQP | COL:675NP unranked | kingdom | phylum | class | order | superfamily | family | genus | species https://www.catalogueoflife.org/data/taxon/675NP
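A minimal Python sketch of how such a row location could be built, following the `line:zip:hash://sha256/...!/occurrence.txt!/L9` pattern in the example above. The function names (`content_id`, `line_locator`, `locate_rows`) are hypothetical and are not part of any existing tool. The sketch assumes the hash is the SHA-256 of the archive bytes and that line numbers are 1-based, counting the header row.

```python
import hashlib
import io
import zipfile


def content_id(data: bytes) -> str:
    # Assumption: the archive is identified by the SHA-256 of its raw bytes.
    return "hash://sha256/" + hashlib.sha256(data).hexdigest()


def line_locator(archive_id: str, entry: str, line_number: int) -> str:
    # Mirrors the pattern from the example: line:zip:<archive id>!/<entry>!/L<n>
    return f"line:zip:{archive_id}!/{entry}!/L{line_number}"


def locate_rows(archive_bytes: bytes, entry: str):
    """Yield (locator, row text) for every line of one file inside a zip archive."""
    archive_id = content_id(archive_bytes)
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        with archive.open(entry) as rows:
            # Assumption: line numbers start at 1 and include the header line.
            for number, raw in enumerate(rows, start=1):
                text = raw.decode("utf-8").rstrip("\r\n")
                yield line_locator(archive_id, entry, number), text
```

Each locator can then be placed in front of the aligned row for the name found on that line, so that every alignment result points back to the exact record it came from.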
Suggestion: use ucsb-izc as a smaller example to try out the concept, then move on to MCZ to get a sense of how the method performs.
Automatically align the names from a list or a Darwin Core Archive (DwC-A) using a catalog (e.g., Catalogue of Life).
For example, given this record from MCZ:
This would align the taxon name Anolis festae with the Catalogue of Life, resulting in the output:
Anolis festae HAS_ACCEPTED_NAME COL:675NP Anolis festae species Biota | Animalia | Chordata | Reptilia | Squamata | Iguania | Dactyloidae | Anolis | Anolis festae COL:5T6MX | COL:N | COL:CH2 | COL:RP | COL:45C | COL:87BW7 | COL:8Y8 | COL:WQP | COL:675NP unranked | kingdom | phylum | class | order | superfamily | family | genus | species https://www.catalogueoflife.org/data/taxon/675NP
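To make the structure of the output above easier to see, here is a small Python sketch that pairs up its three pipe-delimited lists (lineage names, lineage IDs, lineage ranks). The helper `parse_lineage` is hypothetical. The three lists are passed in as separate strings, because the delimiter between the outer columns is not recoverable from the flattened example.

```python
def parse_lineage(names: str, ids: str, ranks: str, sep: str = "|"):
    """Combine three pipe-delimited lineage columns into (rank, name, id) tuples."""
    columns = [
        [part.strip() for part in column.split(sep)]
        for column in (names, ids, ranks)
    ]
    # The three lists must describe the same lineage, entry by entry.
    if len({len(column) for column in columns}) != 1:
        raise ValueError("lineage columns differ in length")
    name_list, id_list, rank_list = columns
    return list(zip(rank_list, name_list, id_list))


# Values copied from the example output above.
lineage = parse_lineage(
    "Biota | Animalia | Chordata | Reptilia | Squamata | Iguania | Dactyloidae | Anolis | Anolis festae",
    "COL:5T6MX | COL:N | COL:CH2 | COL:RP | COL:45C | COL:87BW7 | COL:8Y8 | COL:WQP | COL:675NP",
    "unranked | kingdom | phylum | class | order | superfamily | family | genus | species",
)

for rank, name, taxon_id in lineage:
    print(f"{rank}\t{name}\t{taxon_id}")
```

The last tuple corresponds to the accepted name itself (`species`, `Anolis festae`, `COL:675NP`), and the earlier tuples give its higher classification.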