globalbioticinteractions / scan

Symbiota Collections of Arthropods Network (SCAN) Registry

a selection of all non-matched names in (expected) reported scan species associations #5

Open jhpoelen opened 3 years ago

jhpoelen commented 3 years ago

Inspired by a recent conversation about formatting associatedTaxa values (e.g., "Visiting [some name]" --> "Visiting: [same name]"), I ran the following commands on 2020-11-05 to create an exhaustive list of all SCAN names that, for some reason, could not be linked to taxonomic naming schemes.

The data shared here is by no means a measure of the quality of data in SCAN. It is more a tool to quickly see what kind of association data from SCAN exists that GloBI doesn't quite understand how to read yet.

curl "https://depot.globalbioticinteractions.org/snapshot/target/data/tsv/interactions.tsv.gz" | gunzip | head -n1 | gzip > header.tsv.gz
curl "https://depot.globalbioticinteractions.org/snapshot/target/data/tsv/interactions.tsv.gz" | gunzip | grep "no:match" | grep "globalbioticinteractions/scan" | gzip > rows.tsv.gz
cat header.tsv.gz rows.tsv.gz > no-match-scan-interactions.tsv.gz
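The three commands above download the snapshot twice. A variant that works from a single local copy might look like the sketch below. The function name `extract_no_match` and the local file names are assumptions for illustration only; the filter patterns are the same ones used above.

```shell
# Hypothetical helper (not part of GloBI tooling): keep the header line plus
# every row that mentions both "no:match" and the SCAN registry name.
# $1 = local gzipped interactions file, $2 = gzipped output file.
extract_no_match() {
  gunzip -c "$1" \
    | awk 'NR == 1 || (/no:match/ && /globalbioticinteractions\/scan/)' \
    | gzip > "$2"
}

# Example usage, assuming the snapshot was downloaded once beforehand:
#   curl -o interactions.tsv.gz "https://depot.globalbioticinteractions.org/snapshot/target/data/tsv/interactions.tsv.gz"
#   extract_no_match interactions.tsv.gz no-match-scan-interactions.tsv.gz
```

Reading the file in one pass avoids the second download and produces a single gzip stream rather than two concatenated ones.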

For example, the record https://scan-bugs.org/portal/collections/individual/index.php?occid=31925934 contains "Visiting Cleistesiopsis divaricata".

@neilcobb @seltmann please let me know if there's another way I can help provide feedback to SCAN records.

no-match-scan-interactions.tsv.gz

neilcobb commented 3 years ago

@evindunn @jhpoelen @seltmann Evin is out of town but will be back on Monday. I can email all the collections and get their permission to batch edit. Thanks, Jorrit.

jhpoelen commented 3 years ago

@neilcobb great to hear that you are interested in looking beyond the orchid pollinators!

Before jumping into it, would it help to review the attached SCAN records that GloBI doesn't quite understand?

This way, I hope we can make the best use of Evin's and the collection managers' precious time.

If you like this idea, I'd be happy to do a little more analysis and meet sooner rather than later.

-jorrit

neilcobb commented 3 years ago

@jhpoelen @evindunn

I believe Evin is available anytime between 10:30-12:30 M-F, and I am available any day next week except Wednesday.

jhpoelen commented 3 years ago

@neilcobb @evindunn Thursday 12 Nov 11am Eastern / 8am Pacific works best for me. Please confirm.

neilcobb commented 3 years ago

[image]

neilcobb commented 3 years ago

@jhpoelen @evindunn

I should have provided Evin's schedule before. Thursday 8 AM PST does not work.

jhpoelen commented 3 years ago

Ok, perhaps I was confused about the timezones. I can meet later that day up until 12pm Pacific. Please propose a suitable time.

jhpoelen commented 3 years ago

@evindunn @neilcobb how about Friday 13 Nov 8:00am Pacific?

neilcobb commented 3 years ago

@jhpoelen @evindunn We will have to push this back a week or even two. Evin has a couple of Symbiota2 deadlines and a computer vision deadline that are critical to meet.

jhpoelen commented 3 years ago

@neilcobb - OK, sounds good. Please propose a tentative date to keep this important topic from getting dropped.

neilcobb commented 3 years ago

@jhpoelen do you know anyone who knows TypeScript and Angular and would be willing to work 20-40 hours over the next two months to help Evin? Somebody we can hire as a temp?

jhpoelen commented 3 years ago

Just heard from @neilcobb that updates to the West Virginia Wesleyan College Katharine B. Gregg Orchid Pollinator Collection might be made late January 2021 or later.

jhpoelen commented 3 years ago

Here are some details on the suggested changes from an earlier email:

... Today, I noticed that there's a wealth of flower visiting associations recorded in the West Virginia Wesleyan College Katharine B. Gregg Orchid Pollinator Collection.

For instance, the record https://scan-bugs.org/portal/collections/individual/index.php?occid=31925913 contains an associatedTaxa field value that includes "Visiting Pogonia ophioglossoides".

Please note that the DwC standard suggests delimiting the kind of association with a colon. So, instead of "Visiting Pogonia ophioglossoides", you'd say "Visiting: Pogonia ophioglossoides". Many other collections already do this, and their richly annotated specimens are easier to find because of it. ...
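As a rough illustration of the suggested change, a sketch like the following would insert the missing colon after a leading "Visiting". The function name `add_colon` is hypothetical, and the sketch assumes each input line holds a single association that starts with that verb; real associatedTaxa values may hold several associations and would need more care before any batch edit.

```shell
# Hypothetical helper: add a colon after a leading "Visiting" when it is
# missing. Lines that already read "Visiting: ..." pass through unchanged.
add_colon() {
  sed -E 's/^Visiting +([^:[:space:]])/Visiting: \1/'
}

printf '%s\n' "Visiting Pogonia ophioglossoides" | add_colon
# prints: Visiting: Pogonia ophioglossoides
```

Because values that already contain the colon are left alone, the same filter could be run repeatedly over a list of values without doubling the delimiter.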

[Image: Screenshot from 2020-12-16 14-21-08]