bat-literature / bat-literature.github.io

The Bat Literature Project aims to facilitate discovery of scientific literature on bats (Chiroptera)
Creative Commons Zero v1.0 Universal

request to review v0.5 BatLit sandbox records #23

Open jhpoelen opened 2 months ago

jhpoelen commented 2 months ago

See https://sandbox.zenodo.org/communities/batlit-review-md5-26f7ce5dd404e33c6570edd4ba250d20 for records I've generated from BatLit v0.5 https://linker.bio/hash://md5/26f7ce5dd404e33c6570edd4ba250d20 corpus. See also https://batlit.org

Please submit your review comments by Wednesday 28 Aug 2024.

n8upham commented 2 months ago

Hey @jhpoelen -- super cool. In an effort to do a full data review, I installed preston v0.8.6 on my own machine like so: sudo sh -c '(curl -L https://github.com/bio-guoda/preston/releases/download/0.8.6/preston.jar) > /usr/local/bin/preston && chmod +x /usr/local/bin/preston && preston config-manpage' && preston version

But then I encountered an initial error when using preston to retrieve the BatLit datasets: preston cat hash://md5/26f7ce5dd404e33c6570edd4ba250d20 yields this error:

java.io.IOException: problem retrieving [hash://md5/26f7ce5dd404e33c6570edd4ba250d20] at bio.guoda.preston.cmd.ContentQueryUtil.getContent(ContentQueryUtil.java:59) at bio.guoda.preston.cmd.ContentQueryUtil.copyContent(ContentQueryUtil.java:33) at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:69) at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:49) at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:44) at picocli.CommandLine.executeUserObject(CommandLine.java:1939) at picocli.CommandLine.access$1300(CommandLine.java:145) at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2358) at picocli.CommandLine$RunLast.handle(CommandLine.java:2352) at picocli.CommandLine$RunLast.handle(CommandLine.java:2314) at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179) at picocli.CommandLine$RunLast.execute(CommandLine.java:2316) at picocli.CommandLine.execute(CommandLine.java:2078) at bio.guoda.preston.Preston.run(Preston.java:103) at bio.guoda.preston.Preston.main(Preston.java:94) Caused by: bio.guoda.preston.store.DereferenceException: failed to dereference [hash://md5/26f7ce5dd404e33c6570edd4ba250d20] at bio.guoda.preston.store.AliasDereferencer.dereferenceAliasedHash(AliasDereferencer.java:94) at bio.guoda.preston.store.AliasDereferencer.get(AliasDereferencer.java:46) at bio.guoda.preston.store.AliasDereferencer.get(AliasDereferencer.java:18) at bio.guoda.preston.cmd.ContentQueryUtil.getContent(ContentQueryUtil.java:51) ... 14 more Caused by: bio.guoda.preston.store.DereferenceException: failed to dereference [hash://md5/26f7ce5dd404e33c6570edd4ba250d20] at bio.guoda.preston.store.ContentHashDereferencer.get(ContentHashDereferencer.java:25) at bio.guoda.preston.store.ContentHashDereferencer.get(ContentHashDereferencer.java:10) at bio.guoda.preston.store.AliasDereferencer.dereferenceAliasedHash(AliasDereferencer.java:92) ... 
17 more Caused by: java.io.IOException: cannot find content identified by [hash://md5/26f7ce5dd404e33c6570edd4ba250d20] at bio.guoda.preston.stream.ContentStreamFactory.create(ContentStreamFactory.java:69) at bio.guoda.preston.store.ContentHashDereferencer.get(ContentHashDereferencer.java:23) ... 19 more java.lang.RuntimeException: java.io.IOException: problem retrieving [hash://md5/26f7ce5dd404e33c6570edd4ba250d20] at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:52) at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:44) at picocli.CommandLine.executeUserObject(CommandLine.java:1939) at picocli.CommandLine.access$1300(CommandLine.java:145) at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2358) at picocli.CommandLine$RunLast.handle(CommandLine.java:2352) at picocli.CommandLine$RunLast.handle(CommandLine.java:2314) at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179) at picocli.CommandLine$RunLast.execute(CommandLine.java:2316) at picocli.CommandLine.execute(CommandLine.java:2078) at bio.guoda.preston.Preston.run(Preston.java:103) at bio.guoda.preston.Preston.main(Preston.java:94) Caused by: java.io.IOException: problem retrieving [hash://md5/26f7ce5dd404e33c6570edd4ba250d20] at bio.guoda.preston.cmd.ContentQueryUtil.getContent(ContentQueryUtil.java:59) at bio.guoda.preston.cmd.ContentQueryUtil.copyContent(ContentQueryUtil.java:33) at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:69) at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:49) ... 11 more Caused by: bio.guoda.preston.store.DereferenceException: failed to dereference [hash://md5/26f7ce5dd404e33c6570edd4ba250d20] at bio.guoda.preston.store.AliasDereferencer.dereferenceAliasedHash(AliasDereferencer.java:94) at bio.guoda.preston.store.AliasDereferencer.get(AliasDereferencer.java:46) at bio.guoda.preston.store.AliasDereferencer.get(AliasDereferencer.java:18) at bio.guoda.preston.cmd.ContentQueryUtil.getContent(ContentQueryUtil.java:51) ... 
14 more Caused by: bio.guoda.preston.store.DereferenceException: failed to dereference [hash://md5/26f7ce5dd404e33c6570edd4ba250d20] at bio.guoda.preston.store.ContentHashDereferencer.get(ContentHashDereferencer.java:25) at bio.guoda.preston.store.ContentHashDereferencer.get(ContentHashDereferencer.java:10) at bio.guoda.preston.store.AliasDereferencer.dereferenceAliasedHash(AliasDereferencer.java:92) ... 17 more Caused by: java.io.IOException: cannot find content identified by [hash://md5/26f7ce5dd404e33c6570edd4ba250d20] at bio.guoda.preston.stream.ContentStreamFactory.create(ContentStreamFactory.java:69) at bio.guoda.preston.store.ContentHashDereferencer.get(ContentHashDereferencer.java:23) ... 19 more

Any ideas where I went wrong?

jhpoelen commented 2 months ago

@n8upham Thanks for your message and for taking the time to share your description.

To print (or 'cat') the most recent provenance log with content id hash://md5/26f7ce5dd404e33c6570edd4ba250d20, you'd have to add a "remote" to let preston know that the resource may be available elsewhere.

Here's one way to do it:

preston cat --remote https://linker.bio hash://md5/.,,

And, please note that only the metadata is available, not the pdfs. If you'd like to have the full datasets (including the pdfs) and you are comfortable using ssh, I can create an account for you and you can grab the full corpus.

Alternatively, send me a self-addressed hard disk with return postage and I'll send it by USPS. Note that a 128GB thumbdrive should be more than enough.

n8upham commented 2 months ago

OK nice! I've got this part working now -- thanks for that tip. So when I write: preston cat --remote https://linker.bio hash://md5/26f7ce5dd404e33c6570edd4ba250d20 I get a download of a 62.1 MB file named "26f7ce5dd404e33c6570edd4ba250d20" that is nested in the folders /data/26/f7/.

So then I can just do preston cat hash://md5/26f7ce5dd404e33c6570edd4ba250d20 directly and start grepping things out of it -- e.g., preston cat hash://md5/26f7ce5dd404e33c6570edd4ba250d20 | grep "items[?]" | grep hasVersion | wc -l yields 658 records. So far so good.

But then when I go to the next level of preston cat hash://md5/26f7ce5dd404e33c6570edd4ba250d20 | grep "items[?]" | grep hasVersion | preston cat | jq -c .[] | head -n1 | jq . I get an error with the second calling of preston, as below --

zsh: no matches found: .[] zsh: command not found: jq java.io.IOException: problem retrieving [hash://md5/00e40ec6aae2408f289ef11b3d803994] at bio.guoda.preston.cmd.ContentQueryUtil.getContent(ContentQueryUtil.java:59) at bio.guoda.preston.cmd.ContentQueryUtil.copyContent(ContentQueryUtil.java:33) at bio.guoda.preston.cmd.ContentQueryUtil.copyMostRecentContent(ContentQueryUtil.java:22) at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:64) at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:49) at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:44) at picocli.CommandLine.executeUserObject(CommandLine.java:1939) at picocli.CommandLine.access$1300(CommandLine.java:145) at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2358) at picocli.CommandLine$RunLast.handle(CommandLine.java:2352) at picocli.CommandLine$RunLast.handle(CommandLine.java:2314) at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179) at picocli.CommandLine$RunLast.execute(CommandLine.java:2316) at picocli.CommandLine.execute(CommandLine.java:2078) at bio.guoda.preston.Preston.run(Preston.java:103) at bio.guoda.preston.Preston.main(Preston.java:94) Caused by: bio.guoda.preston.store.DereferenceException: failed to dereference [hash://md5/00e40ec6aae2408f289ef11b3d803994] at bio.guoda.preston.store.AliasDereferencer.dereferenceAliasedHash(AliasDereferencer.java:94) at bio.guoda.preston.store.AliasDereferencer.get(AliasDereferencer.java:46) at bio.guoda.preston.store.AliasDereferencer.get(AliasDereferencer.java:18) at bio.guoda.preston.cmd.ContentQueryUtil.getContent(ContentQueryUtil.java:51) ... 
15 more Caused by: bio.guoda.preston.store.DereferenceException: failed to dereference [hash://md5/00e40ec6aae2408f289ef11b3d803994] at bio.guoda.preston.store.ContentHashDereferencer.get(ContentHashDereferencer.java:25) at bio.guoda.preston.store.ContentHashDereferencer.get(ContentHashDereferencer.java:10) at bio.guoda.preston.store.AliasDereferencer.dereferenceAliasedHash(AliasDereferencer.java:92) ... 18 more Caused by: java.io.IOException: cannot find content identified by [hash://md5/00e40ec6aae2408f289ef11b3d803994] at bio.guoda.preston.stream.ContentStreamFactory.create(ContentStreamFactory.java:69) at bio.guoda.preston.store.ContentHashDereferencer.get(ContentHashDereferencer.java:23) ... 20 more java.lang.RuntimeException: java.io.IOException: problem retrieving [hash://md5/00e40ec6aae2408f289ef11b3d803994] at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:52) at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:44) at picocli.CommandLine.executeUserObject(CommandLine.java:1939) at picocli.CommandLine.access$1300(CommandLine.java:145) at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2358) at picocli.CommandLine$RunLast.handle(CommandLine.java:2352) at picocli.CommandLine$RunLast.handle(CommandLine.java:2314) at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179) at picocli.CommandLine$RunLast.execute(CommandLine.java:2316) at picocli.CommandLine.execute(CommandLine.java:2078) at bio.guoda.preston.Preston.run(Preston.java:103) at bio.guoda.preston.Preston.main(Preston.java:94) Caused by: java.io.IOException: problem retrieving [hash://md5/00e40ec6aae2408f289ef11b3d803994] at bio.guoda.preston.cmd.ContentQueryUtil.getContent(ContentQueryUtil.java:59) at bio.guoda.preston.cmd.ContentQueryUtil.copyContent(ContentQueryUtil.java:33) at bio.guoda.preston.cmd.ContentQueryUtil.copyMostRecentContent(ContentQueryUtil.java:22) at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:64) at 
bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:49) ... 11 more Caused by: bio.guoda.preston.store.DereferenceException: failed to dereference [hash://md5/00e40ec6aae2408f289ef11b3d803994] at bio.guoda.preston.store.AliasDereferencer.dereferenceAliasedHash(AliasDereferencer.java:94) at bio.guoda.preston.store.AliasDereferencer.get(AliasDereferencer.java:46) at bio.guoda.preston.store.AliasDereferencer.get(AliasDereferencer.java:18) at bio.guoda.preston.cmd.ContentQueryUtil.getContent(ContentQueryUtil.java:51) ... 15 more Caused by: bio.guoda.preston.store.DereferenceException: failed to dereference [hash://md5/00e40ec6aae2408f289ef11b3d803994] at bio.guoda.preston.store.ContentHashDereferencer.get(ContentHashDereferencer.java:25) at bio.guoda.preston.store.ContentHashDereferencer.get(ContentHashDereferencer.java:10) at bio.guoda.preston.store.AliasDereferencer.dereferenceAliasedHash(AliasDereferencer.java:92) ... 18 more Caused by: java.io.IOException: cannot find content identified by [hash://md5/00e40ec6aae2408f289ef11b3d803994] at bio.guoda.preston.stream.ContentStreamFactory.create(ContentStreamFactory.java:69) at bio.guoda.preston.store.ContentHashDereferencer.get(ContentHashDereferencer.java:23) ... 20 more

Any ideas there?

jhpoelen commented 2 months ago

@n8upham Nice! Glad to see you got the provenance log.

For getting the associated Zotero metadata records, you may want to include the --remote option to the preston cat command in your workflow.

e.g.,

preston cat --remote https://linker.bio hash://md5/26f7ce5dd404e33c6570edd4ba250d20 | grep "items[?]" | grep hasVersion | preston cat --remote https://linker.bio | jq -c .[] | head -n1 | jq .

jhpoelen commented 2 months ago

PS. I do realize the error messaging is a bit verbose and nerdy... sorry about that. Please let me know if you have any suggestions on what kind of error messages you'd like to see instead.

PS2. Another way to get all the batlit metadata is to clone the bat-literature repo -

git clone https://github.com/bat-literature/bat-literature.github.io

n8upham commented 2 months ago

Ah okay nice -- yeah this is working now. That makes sense that the 2nd calling of preston cat also requires the remote tag in order to be downloading those 658 individual records.

Got those downloaded now, which created a bunch of additional folders in my /data/ directory -- 224 in total, which is fewer than the 658 that I was expecting, but then I realized that several folders have multiple subfolders, so all good.
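For readers puzzled by the same mismatch, here is a minimal sketch of why a folder count can be lower than a record count. It assumes, based only on the /data/26/f7/ path mentioned earlier in this thread, that content is filed under folders named after the first characters of its hash. The directory and file names below are made up, and the real layout may differ between preston versions.

```shell
# Build a tiny, made-up content store in a temporary directory.
tmp="$(mktemp -d)"
mkdir -p "$tmp/data/00/e4" "$tmp/data/00/a1" "$tmp/data/26/f7"
: > "$tmp/data/00/e4/00e4-example"   # hypothetical file names
: > "$tmp/data/00/a1/00a1-example"
: > "$tmp/data/26/f7/26f7-example"

# Top-level folders: one per distinct two-character prefix.
top_level="$(ls "$tmp/data" | wc -l | tr -d ' ')"
# Files: one per stored item, however deeply nested.
files="$(find "$tmp/data" -type f | wc -l | tr -d ' ')"

echo "top-level folders: $top_level, files: $files"
rm -rf "$tmp"
```

Under that assumption, two items whose hashes share a prefix land in the same top-level folder, so counting files with find is a closer match to the number of records than counting folders.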

Then I realized that I don't have jq installed, so I ran brew install jq, which succeeded.

But then this code preston cat hash://md5/26f7ce5dd404e33c6570edd4ba250d20 | grep "items[?]" | grep hasVersion | preston cat | jq -c .[] | head -n1 | jq . yields zsh: no matches found: .[] rather than the expected JSON entries

Any ideas there? Maybe not critical since it looks like I indeed have all the metadata in line -- as the command preston cat hash://md5/26f7ce5dd404e33c6570edd4ba250d20 | grep "items[?]" | grep hasVersion | preston cat | head -n10 returns the first few entries

What type of further data review are you looking for?

jhpoelen commented 2 months ago

I just ran

preston cat hash://md5/26f7ce5dd404e33c6570edd4ba250d20 | grep "items[?]" | grep hasVersion | preston cat | jq -c .[] | head -n1 | jq .

and this produced:

{
  "key": "NYT92CCF",
  "version": 48777,
  "library": {
    "type": "group",
    "id": 5435545,
    "name": "Bat Literature Project",
    "links": {
      "alternate": {
        "href": "https://www.zotero.org/groups/bat_literature_project",
        "type": "text/html"
      }
    }
  },
  "links": {
    "self": {
      "href": "https://api.zotero.org/groups/5435545/items/NYT92CCF",
      "type": "application/json"
    },
    "alternate": {
      "href": "https://www.zotero.org/groups/bat_literature_project/items/NYT92CCF",
      "type": "text/html"
    },
    "attachment": {
      "href": "https://api.zotero.org/groups/5435545/items/3MVRRMR8",
      "type": "application/json",
      "attachmentType": "application/pdf",
      "attachmentSize": 388576
    }
  },
  "meta": {
    "createdByUser": {
      "id": 13229919,
      "username": "acsherman",
      "name": "",
      "links": {
        "alternate": {
          "href": "https://www.zotero.org/acsherman",
          "type": "text/html"
        }
      }
    },
    "creatorSummary": "Thong et al.",
    "parsedDate": "2010-10-14",
    "numChildren": 1
  },
  "data": {
    "key": "NYT92CCF",
    "version": 48777,
    "itemType": "journalArticle",
    "title": "Further records of Murina tiensa from Vietnam with first information on its echolocation calls.",
    "creators": [
      {
        "creatorType": "author",
        "firstName": "Vu Dinh",
        "lastName": "Thong"
      },
      {
        "creatorType": "author",
        "firstName": "Christian",
        "lastName": "Dietz"
      },
      {
        "creatorType": "author",
        "firstName": "Annette",
        "lastName": "Denzinger"
      },
      {
        "creatorType": "author",
        "firstName": "Paul J. J.",
        "lastName": "Bates"
      },
      {
        "creatorType": "author",
        "firstName": "Neil M.",
        "lastName": "Furey"
      },
      {
        "creatorType": "author",
        "firstName": "Gabor",
        "lastName": "Csorba"
      },
      {
        "creatorType": "author",
        "firstName": "Glenn",
        "lastName": "Hoye"
      },
      {
        "creatorType": "author",
        "firstName": "Le Dinh",
        "lastName": "Thuy"
      },
      {
        "creatorType": "author",
        "firstName": "Hans-Ulrich",
        "lastName": "Schnitzler"
      }
    ],
    "abstractNote": "The fairy tube-nosed bat, Murina tiensa, is considered to be endemic to Vietnam. It is known only from the original description, when it was found at two localities in limestone karst areas. In 2008, we conducted a series of intensive field surveys throughout the country and obtained additional records of this species from various habitats, including degraded to nearly pristine forests and an offshore island. Our results indicate that M. tiensa is a sexually dimorphic species, females being considerably larger than males in all external and craniodental measurements. The species emits broadband, downward frequency-modulated echolocation calls with a dominant first harmonic. When handheld or when flying in a flight tent, signals had a similar structure and were emitted in groups of 2–4 signals. On average, signals swept from 150 to 49 kHz in 2.2 ms for handheld bats, and, from 145 to 50 kHz in 1.9 ms for flying bats. M. tiensa often occurred in sympatry with M. cyclotis and several rhinolophids.",
    "publicationTitle": "Hystrix, the Italian Journal of Mammalogy",
    "volume": "22",
    "issue": "1",
    "pages": "",
    "date": "October 14, 2010",
    "series": "",
    "seriesTitle": "",
    "seriesText": "",
    "journalAbbreviation": "",
    "language": "en",
    "DOI": "10.4404/hystrix-22.1-4533",
    "ISSN": "18255272, 03941914",
    "shortTitle": "",
    "url": "https://doi.org/10.4404/hystrix-22.1-4533",
    "accessDate": "2024-06-28T00:31:16Z",
    "archive": "",
    "archiveLocation": "",
    "libraryCatalog": "DOI.org (CSL JSON)",
    "callNumber": "",
    "rights": "",
    "extra": "",
    "tags": [],
    "collections": [
      "UAWY6DNP"
    ],
    "relations": {
      "dc:replaces": "http://zotero.org/groups/5435545/items/NF6R8YCX"
    },
    "dateAdded": "2024-07-08T02:34:45Z",
    "dateModified": "2024-08-16T13:50:15Z"
  }
}

So, unfortunately, I was unable to reproduce your result.

This hints at a workflow that uses slightly different tools.

Can you please confirm that

preston cat hash://md5/26f7ce5dd404e33c6570edd4ba250d20 | grep "items[?]" | grep hasVersion | head -1

produces:

<https://api.zotero.org/groups/5435545/items?start=0&limit=100> <http://purl.org/pav/hasVersion> <hash://md5/00e40ec6aae2408f289ef11b3d803994> <urn:uuid:14344e3c-b535-4f32-bce4-bb0ccff10bb4> .

jhpoelen commented 2 months ago

> What type of further data review are you looking for?

Thanks for your thorough check on the availability of the batlit metadata.

Another thing that may be valuable to the batlit corpus is to have a peek at the test records as seen through their derived Zenodo deposits at https://sandbox.zenodo.org/communities/batlit-review-md5-26f7ce5dd404e33c6570edd4ba250d20 . If that is too much, please do let me know. I realize that your time is precious.

n8upham commented 2 months ago

Hey @jhpoelen -- yes when I call preston cat hash://md5/26f7ce5dd404e33c6570edd4ba250d20 | grep "items[?]" | grep hasVersion | head -1 it produces <https://api.zotero.org/groups/5435545/items?start=0&limit=100> <http://purl.org/pav/hasVersion> <hash://md5/00e40ec6aae2408f289ef11b3d803994> <urn:uuid:14344e3c-b535-4f32-bce4-bb0ccff10bb4> . which seems to be the same.
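As an illustration of what the two grep steps select, here is a minimal sketch that runs them on the provenance line quoted above plus one made-up line that should be filtered out. The awk and tr steps are an addition for illustration only; in the thread's pipeline the matching lines are passed to preston cat as they are.

```shell
# One line copied from the thread, plus one hypothetical line that should not match.
log='<https://api.zotero.org/groups/5435545/items?start=0&limit=100> <http://purl.org/pav/hasVersion> <hash://md5/00e40ec6aae2408f289ef11b3d803994> <urn:uuid:14344e3c-b535-4f32-bce4-bb0ccff10bb4> .
<https://example.org/not-an-item> <http://example.org/otherProperty> <hash://md5/ffffffffffffffffffffffffffffffff> <urn:uuid:00000000-0000-0000-0000-000000000000> .'

# Keep Zotero item listings that have a version, then print the third
# whitespace-separated field (the content id) without its angle brackets.
ids="$(printf '%s\n' "$log" | grep "items[?]" | grep hasVersion | awk '{ print $3 }' | tr -d '<>')"
echo "$ids"
```

The pattern "items[?]" matches a literal question mark, so only URLs of the form .../items?start=... pass the first filter.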

But then when I call preston cat hash://md5/26f7ce5dd404e33c6570edd4ba250d20 | grep "items[?]" | grep hasVersion | preston cat | jq -c .[] | head -n1 | jq . I still get the error of zsh: no matches found: .[] So it seems that the jq -c .[] part is not finding what it is looking for -- right?

Aha! OK I tried this again, quoting the search term for jq as jq -c '.[]', and this returned the desired output: preston cat hash://md5/26f7ce5dd404e33c6570edd4ba250d20 | grep "items[?]" | grep hasVersion | preston cat | jq -c '.[]' | head -n1 | jq



{
  "key": "NYT92CCF",
  "version": 48777,
  "library": {
    "type": "group",
    "id": 5435545,
    "name": "Bat Literature Project",
    "links": {
      "alternate": {
        "href": "https://www.zotero.org/groups/bat_literature_project",
        "type": "text/html"
      }
    }
  },
  "links": {
    "self": {
      "href": "https://api.zotero.org/groups/5435545/items/NYT92CCF",
      "type": "application/json"
    },
    "alternate": {
      "href": "https://www.zotero.org/groups/bat_literature_project/items/NYT92CCF",
      "type": "text/html"
    },
    "attachment": {
      "href": "https://api.zotero.org/groups/5435545/items/3MVRRMR8",
      "type": "application/json",
      "attachmentType": "application/pdf",
      "attachmentSize": 388576
    }
  },
  "meta": {
    "createdByUser": {
      "id": 13229919,
      "username": "acsherman",
      "name": "",
      "links": {
        "alternate": {
          "href": "https://www.zotero.org/acsherman",
          "type": "text/html"
        }
      }
    },
    "creatorSummary": "Thong et al.",
    "parsedDate": "2010-10-14",
    "numChildren": 1
  },
  "data": {
    "key": "NYT92CCF",
    "version": 48777,
    "itemType": "journalArticle",
    "title": "Further records of Murina tiensa from Vietnam with first information on its echolocation calls.",
    "creators": [
      {
        "creatorType": "author",
        "firstName": "Vu Dinh",
        "lastName": "Thong"
      },
      {
        "creatorType": "author",
        "firstName": "Christian",
        "lastName": "Dietz"
      },
      {
        "creatorType": "author",
        "firstName": "Annette",
        "lastName": "Denzinger"
      },
      {
        "creatorType": "author",
        "firstName": "Paul J. J.",
        "lastName": "Bates"
      },
      {
        "creatorType": "author",
        "firstName": "Neil M.",
        "lastName": "Furey"
      },
      {
        "creatorType": "author",
        "firstName": "Gabor",
        "lastName": "Csorba"
      },
      {
        "creatorType": "author",
        "firstName": "Glenn",
        "lastName": "Hoye"
      },
      {
        "creatorType": "author",
        "firstName": "Le Dinh",
        "lastName": "Thuy"
      },
      {
        "creatorType": "author",
        "firstName": "Hans-Ulrich",
        "lastName": "Schnitzler"
      }
    ],
    "abstractNote": "The fairy tube-nosed bat, Murina tiensa, is considered to be endemic to Vietnam. It is known only from the original description, when it was found at two localities in limestone karst areas. In 2008, we conducted a series of intensive field surveys throughout the country and obtained additional records of this species from various habitats, including degraded to nearly pristine forests and an offshore island. Our results indicate that M. tiensa is a sexually dimorphic species, females being considerably larger than males in all external and craniodental measurements. The species emits broadband, downward frequency-modulated echolocation calls with a dominant first harmonic. When handheld or when flying in a flight tent, signals had a similar structure and were emitted in groups of 2–4 signals. On average, signals swept from 150 to 49 kHz in 2.2 ms for handheld bats, and, from 145 to 50 kHz in 1.9 ms for flying bats. M. tiensa often occurred in sympatry with M. cyclotis and several rhinolophids.",
    "publicationTitle": "Hystrix, the Italian Journal of Mammalogy",
    "volume": "22",
    "issue": "1",
    "pages": "",
    "date": "October 14, 2010",
    "series": "",
    "seriesTitle": "",
    "seriesText": "",
    "journalAbbreviation": "",
    "language": "en",
    "DOI": "10.4404/hystrix-22.1-4533",
    "ISSN": "18255272, 03941914",
    "shortTitle": "",
    "url": "https://doi.org/10.4404/hystrix-22.1-4533",
    "accessDate": "2024-06-28T00:31:16Z",
    "archive": "",
    "archiveLocation": "",
    "libraryCatalog": "DOI.org (CSL JSON)",
    "callNumber": "",
    "rights": "",
    "extra": "",
    "tags": [],
    "collections": [
      "UAWY6DNP"
    ],
    "relations": {
      "dc:replaces": "http://zotero.org/groups/5435545/items/NF6R8YCX"
    },
    "dateAdded": "2024-07-08T02:34:45Z",
    "dateModified": "2024-08-16T13:50:15Z"
  }
}

jhpoelen commented 2 months ago

@n8upham Yay! Thanks for trying this out. I'll make sure to add the quotes in the method section of https://batlit.org . Thanks for being creative in dealing with this.

n8upham commented 2 months ago

For sure, yeah it was the man page for jq (man jq) that showed the form for Unix shells: jq '.["foo"]' So I thought to try the quoting -- annoying that different shells treat this differently.
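A minimal sketch of the quoting fix discussed here. Single quotes keep the shell from treating .[] as a filename pattern, which is what zsh reported with "no matches found". The JSON input is a made-up two-element array, and the jq step is skipped when jq is not installed.

```shell
# Single quotes pass the three characters .[] to the program unchanged.
filter='.[]'
printf 'filter passed to jq: %s\n' "$filter"

# Only run the jq step when jq is available on this machine.
if command -v jq >/dev/null 2>&1; then
  # Hypothetical input: emit each array element on its own line.
  printf '[{"key":"A"},{"key":"B"}]' | jq -c "$filter"
fi
```

Quoting the filter is harmless in shells that would have passed .[] through unquoted, so the quoted form is the safer one to copy and paste.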

jhpoelen commented 2 months ago

I've updated the methods section. Can you please confirm that the command now works through "copy-paste"?

n8upham commented 2 months ago

> What type of further data review are you looking for?

> Thanks for your thorough check on the availability of the batlit metadata.

> Another thing that may be valuable to the batlit corpus is to have a peek at the test records as seen through their derived Zenodo deposits at https://sandbox.zenodo.org/communities/batlit-review-md5-26f7ce5dd404e33c6570edd4ba250d20 . If that is too much, please do let me know. I realize that your time is precious.

The test records look good -- I just noticed that there is some unevenness in how the taxonomy metadata ("Biodiversity section") is annotated so far, but that is likely something that we continue to amend / build, e.g. https://sandbox.zenodo.org/records/102854 vs. https://sandbox.zenodo.org/records/101967 and https://sandbox.zenodo.org/records/101965

n8upham commented 2 months ago

> I've updated the methods section. Can you please confirm that the command now works through "copy-paste"?

Yes the command does work via copy/paste now -- but it depends on me having already run the command using the --remote https://linker.bio flag on both of the preston cat calls, and that I'm in the appropriate directory to be able to find those downloaded files. So I'd suggest further documentation of that process
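A minimal sketch of a check for the situation described here: whether content is already present in the current directory before running preston cat without --remote. The path layout (data/, then the first two and next two characters of the hash) is an assumption inferred from the /data/26/f7/ location reported earlier in this thread, and local_path_for is a hypothetical helper, not part of preston.

```shell
# Hypothetical helper: guess where a content id would be stored locally,
# assuming the data/<first 2 chars>/<next 2 chars>/<hash> layout seen in this thread.
local_path_for() {
  hex="${1#hash://md5/}"
  first="$(printf '%s' "$hex" | cut -c1-2)"
  second="$(printf '%s' "$hex" | cut -c3-4)"
  printf 'data/%s/%s/%s\n' "$first" "$second" "$hex"
}

id='hash://md5/26f7ce5dd404e33c6570edd4ba250d20'
path="$(local_path_for "$id")"

if [ -f "$path" ]; then
  echo "found locally: $path"
else
  echo "not found locally: $path (try adding --remote https://linker.bio)"
fi
```

Because the path is relative, the result depends on the directory the check is run from, which matches the observation that the copy-paste command only works from the folder where the data was downloaded.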

jhpoelen commented 2 months ago

I've updated the examples in https://batlit.org to include your suggestions. Please let me know if there's anything else that needs updating to reproduce the examples.

n8upham commented 2 months ago

Awesome, I'd say it's good to go now. Only thing -- for total newbs, it won't be obvious how to install preston or jq -- but I can understand that you also want to keep the help documentation for those utilities separate to ease future maintenance. But new students getting on board are likely to hit an initial wall there
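One way to soften that initial wall is a short preflight check that reports which programs are missing before any example is run. This is a minimal sketch; missing_tools is a hypothetical helper name, and the list of programs is taken from the commands used in this thread.

```shell
# Print the name of each given program that is not found on the PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing="$(missing_tools preston jq grep)"
if [ -n "$missing" ]; then
  echo "missing programs:"
  echo "$missing"
else
  echo "all programs found"
fi
```

A check like this would have surfaced the "command not found: jq" message seen earlier in the thread before the longer pipeline was attempted.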

jhpoelen commented 2 months ago

@n8upham thanks for your feedback.

I've added the following "box" to the https://batlit.org description -

💡 In the following sections, some examples use a notation common in the Unix shell, also known as the “command line” or “terminal”. At the time of writing, these examples can be run provided the following programs are available: preston and jq, as well as more commonly available Unix/POSIX/Linux programs like grep, sort, and uniq. To run these programs, please use a Linux distribution, macOS, or Windows Subsystem for Linux (WSL), available on Windows 10 and higher. These tools are powerful: they can process lots of data very quickly and can run offline. If you are unfamiliar with these tools, you may benefit from going through a Carpentries lesson like https://librarycarpentry.org/lc-shell/ or one of the many other educational materials. Note that some of these tools have been around since the 1970s and are likely to stick around a little while longer.

Please feel free to edit or suggest changes via https://github.com/bat-literature/bat-literature.github.io/blob/main/README.md .

ariadnamorales commented 2 months ago

I installed preston and jq and tried:

preston cat --remote https://linker.bio hash://md5/26f7ce5dd404e33c6570edd4ba250d20 | grep "items[?]" | grep hasVersion | preston cat --remote https://linker.bio | jq -c .[] | head -n1 | jq .

I get the error:

zsh: no matches found: .[]

Nevertheless, I can start downloading data:

[https://linker.bio/hash:...404e33c6570edd4ba250d20] 8 MB at 5.45 MB/s
[https://linker.bio/hash:...404e33c6570edd4ba250d20] 9 MB at 5.65 MB/s
[https://linker.bio/hash:...404e33c6570edd4ba250d20] 11 MB at 5.81 MB/s
[https://linker.bio/hash:...404e33c6570edd4ba250d20] 31 MB at 6.46 MB/s
...

I also tried with the quotes, as Nathan mentioned:

preston cat hash://md5/26f7ce5dd404e33c6570edd4ba250d20 | grep "items[?]" | grep hasVersion | preston cat | jq -c '.[]' | head -n1 | jq

but I get the lengthy error:

java.io.IOException: problem retrieving [hash://md5/26f7ce5dd404e33c6570edd4ba250d20]
    at bio.guoda.preston.cmd.ContentQueryUtil.getContent(ContentQueryUtil.java:59)
    at bio.guoda.preston.cmd.ContentQueryUtil.copyContent(ContentQueryUtil.java:33)
    at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:69)
    at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:49)
    at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:44)
    at picocli.CommandLine.executeUserObject(CommandLine.java:1939)
    at picocli.CommandLine.access$1300(CommandLine.java:145)
    at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2358)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2352)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2314)
    at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
    at picocli.CommandLine$RunLast.execute(CommandLine.java:2316)
    at picocli.CommandLine.execute(CommandLine.java:2078)
    at bio.guoda.preston.Preston.run(Preston.java:103)
    at bio.guoda.preston.Preston.main(Preston.java:94)
Caused by: bio.guoda.preston.store.DereferenceException: failed to dereference [hash://md5/26f7ce5dd404e33c6570edd4ba250d20]
    at bio.guoda.preston.store.AliasDereferencer.dereferenceAliasedHash(AliasDereferencer.java:94)
    at bio.guoda.preston.store.AliasDereferencer.get(AliasDereferencer.java:46)
    at bio.guoda.preston.store.AliasDereferencer.get(AliasDereferencer.java:18)
    at bio.guoda.preston.cmd.ContentQueryUtil.getContent(ContentQueryUtil.java:51)
    ... 14 more
Caused by: bio.guoda.preston.store.DereferenceException: failed to dereference [hash://md5/26f7ce5dd404e33c6570edd4ba250d20]
    at bio.guoda.preston.store.ContentHashDereferencer.get(ContentHashDereferencer.java:25)
    at bio.guoda.preston.store.ContentHashDereferencer.get(ContentHashDereferencer.java:10)
    at bio.guoda.preston.store.AliasDereferencer.dereferenceAliasedHash(AliasDereferencer.java:92)
    ... 17 more
Caused by: java.io.IOException: cannot find content identified by [hash://md5/26f7ce5dd404e33c6570edd4ba250d20]
    at bio.guoda.preston.stream.ContentStreamFactory.create(ContentStreamFactory.java:69)
    at bio.guoda.preston.store.ContentHashDereferencer.get(ContentHashDereferencer.java:23)
    ... 19 more
java.lang.RuntimeException: java.io.IOException: problem retrieving [hash://md5/26f7ce5dd404e33c6570edd4ba250d20]
    at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:52)
    at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:44)
    at picocli.CommandLine.executeUserObject(CommandLine.java:1939)
    at picocli.CommandLine.access$1300(CommandLine.java:145)
    at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2358)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2352)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2314)
    at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
    at picocli.CommandLine$RunLast.execute(CommandLine.java:2316)
    at picocli.CommandLine.execute(CommandLine.java:2078)
    at bio.guoda.preston.Preston.run(Preston.java:103)
    at bio.guoda.preston.Preston.main(Preston.java:94)
Caused by: java.io.IOException: problem retrieving [hash://md5/26f7ce5dd404e33c6570edd4ba250d20]
    at bio.guoda.preston.cmd.ContentQueryUtil.getContent(ContentQueryUtil.java:59)
    at bio.guoda.preston.cmd.ContentQueryUtil.copyContent(ContentQueryUtil.java:33)
    at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:69)
    at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:49)
    ... 11 more
Caused by: bio.guoda.preston.store.DereferenceException: failed to dereference [hash://md5/26f7ce5dd404e33c6570edd4ba250d20]
    at bio.guoda.preston.store.AliasDereferencer.dereferenceAliasedHash(AliasDereferencer.java:94)
    at bio.guoda.preston.store.AliasDereferencer.get(AliasDereferencer.java:46)
    at bio.guoda.preston.store.AliasDereferencer.get(AliasDereferencer.java:18)
    at bio.guoda.preston.cmd.ContentQueryUtil.getContent(ContentQueryUtil.java:51)
    ... 14 more
Caused by: bio.guoda.preston.store.DereferenceException: failed to dereference [hash://md5/26f7ce5dd404e33c6570edd4ba250d20]
    at bio.guoda.preston.store.ContentHashDereferencer.get(ContentHashDereferencer.java:25)
    at bio.guoda.preston.store.ContentHashDereferencer.get(ContentHashDereferencer.java:10)
    at bio.guoda.preston.store.AliasDereferencer.dereferenceAliasedHash(AliasDereferencer.java:92)
    ... 17 more
Caused by: java.io.IOException: cannot find content identified by [hash://md5/26f7ce5dd404e33c6570edd4ba250d20]
    at bio.guoda.preston.stream.ContentStreamFactory.create(ContentStreamFactory.java:69)
    at bio.guoda.preston.store.ContentHashDereferencer.get(ContentHashDereferencer.java:23)
    ... 19 more

jhpoelen commented 2 months ago

Hi @ariadnamorales thanks for trying this out, and apologies for the cryptic error messages. They are quite informative for me, but may not be as helpful for others.

Hmm. You are able to download the content, but for some reason unable to access it.

Do you have permissions to create files in the folder from which you run preston?

ariadnamorales commented 2 months ago

I tried

preston cat --remote https://linker.bio hash://md5/26f7ce5dd404e33c6570edd4ba250d20 | grep "items[?]" | grep hasVersion | head -1

and I get

[https://linker.bio/hash:...404e33c6570edd4ba250d20] 59 MB at 6.75 MB/s completed in < 1 minute
https://api.zotero.org/groups/5435545/items?start=0&limit=100 http://purl.org/pav/hasVersion hash://md5/00e40ec6aae2408f289ef11b3d803994 .

so, adding "--remote https://linker.bio" works, but then I tried:

preston cat --remote https://linker.bio hash://md5/26f7ce5dd404e33c6570edd4ba250d20 | grep "items[?]" | grep hasVersion | preston cat | jq -c .[] | head -n1 | jq

With or without quotes around .[], I still get the error:

zsh: no matches found: .[]
java.io.IOException: problem retrieving [hash://md5/00e40ec6aae2408f289ef11b3d803994]
    at bio.guoda.preston.cmd.ContentQueryUtil.getContent(ContentQueryUtil.java:59)
    at bio.guoda.preston.cmd.ContentQueryUtil.copyContent(ContentQueryUtil.java:33)
    at bio.guoda.preston.cmd.ContentQueryUtil.copyMostRecentContent(ContentQueryUtil.java:22)
    at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:64)
    at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:49)
    at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:44)
    at picocli.CommandLine.executeUserObject(CommandLine.java:1939)
    at picocli.CommandLine.access$1300(CommandLine.java:145)
    at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2358)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2352)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2314)
    at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
    at picocli.CommandLine$RunLast.execute(CommandLine.java:2316)
    at picocli.CommandLine.execute(CommandLine.java:2078)
    at bio.guoda.preston.Preston.run(Preston.java:103)
    at bio.guoda.preston.Preston.main(Preston.java:94)
Caused by: bio.guoda.preston.store.DereferenceException: failed to dereference [hash://md5/00e40ec6aae2408f289ef11b3d803994]
    at bio.guoda.preston.store.AliasDereferencer.dereferenceAliasedHash(AliasDereferencer.java:94)
    at bio.guoda.preston.store.AliasDereferencer.get(AliasDereferencer.java:46)
    at bio.guoda.preston.store.AliasDereferencer.get(AliasDereferencer.java:18)
    at bio.guoda.preston.cmd.ContentQueryUtil.getContent(ContentQueryUtil.java:51)
    ... 15 more
Caused by: bio.guoda.preston.store.DereferenceException: failed to dereference [hash://md5/00e40ec6aae2408f289ef11b3d803994]
    at bio.guoda.preston.store.ContentHashDereferencer.get(ContentHashDereferencer.java:25)
    at bio.guoda.preston.store.ContentHashDereferencer.get(ContentHashDereferencer.java:10)
    at bio.guoda.preston.store.AliasDereferencer.dereferenceAliasedHash(AliasDereferencer.java:92)
    ... 18 more
Caused by: java.io.IOException: cannot find content identified by [hash://md5/00e40ec6aae2408f289ef11b3d803994]
    at bio.guoda.preston.stream.ContentStreamFactory.create(ContentStreamFactory.java:69)
    at bio.guoda.preston.store.ContentHashDereferencer.get(ContentHashDereferencer.java:23)
    ... 20 more
java.lang.RuntimeException: java.io.IOException: problem retrieving [hash://md5/00e40ec6aae2408f289ef11b3d803994]
    at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:52)
    at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:44)
    at picocli.CommandLine.executeUserObject(CommandLine.java:1939)
    at picocli.CommandLine.access$1300(CommandLine.java:145)
    at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2358)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2352)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2314)
    at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
    at picocli.CommandLine$RunLast.execute(CommandLine.java:2316)
    at picocli.CommandLine.execute(CommandLine.java:2078)
    at bio.guoda.preston.Preston.run(Preston.java:103)
    at bio.guoda.preston.Preston.main(Preston.java:94)
Caused by: java.io.IOException: problem retrieving [hash://md5/00e40ec6aae2408f289ef11b3d803994]
    at bio.guoda.preston.cmd.ContentQueryUtil.getContent(ContentQueryUtil.java:59)
    at bio.guoda.preston.cmd.ContentQueryUtil.copyContent(ContentQueryUtil.java:33)
    at bio.guoda.preston.cmd.ContentQueryUtil.copyMostRecentContent(ContentQueryUtil.java:22)
    at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:64)
    at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:49)
    ... 11 more
Caused by: bio.guoda.preston.store.DereferenceException: failed to dereference [hash://md5/00e40ec6aae2408f289ef11b3d803994]
    at bio.guoda.preston.store.AliasDereferencer.dereferenceAliasedHash(AliasDereferencer.java:94)
    at bio.guoda.preston.store.AliasDereferencer.get(AliasDereferencer.java:46)
    at bio.guoda.preston.store.AliasDereferencer.get(AliasDereferencer.java:18)
    at bio.guoda.preston.cmd.ContentQueryUtil.getContent(ContentQueryUtil.java:51)
    ... 15 more
Caused by: bio.guoda.preston.store.DereferenceException: failed to dereference [hash://md5/00e40ec6aae2408f289ef11b3d803994]
    at bio.guoda.preston.store.ContentHashDereferencer.get(ContentHashDereferencer.java:25)
    at bio.guoda.preston.store.ContentHashDereferencer.get(ContentHashDereferencer.java:10)
    at bio.guoda.preston.store.AliasDereferencer.dereferenceAliasedHash(AliasDereferencer.java:92)
    ... 18 more
Caused by: java.io.IOException: cannot find content identified by [hash://md5/00e40ec6aae2408f289ef11b3d803994]
    at bio.guoda.preston.stream.ContentStreamFactory.create(ContentStreamFactory.java:69)
    at bio.guoda.preston.store.ContentHashDereferencer.get(ContentHashDereferencer.java:23)
    ... 20 more

ariadnamorales commented 2 months ago

Yes, I have permissions; it is my personal laptop (macOS Sonoma 14.5).

jhpoelen commented 2 months ago

@ariadnamorales thanks for sharing. Your example shows that you have permission to save content locally.

Did you try and copy-paste the associated example from https://batlit.org? It appears that your second preston cat does not include the --remote https://linker.bio option.

You may have to refresh the webpage to get the most recent version. Alternatively, you can visit the README.md of the https://github.com/bat-literature/bat-literature.github.io repository and get the code from there.

ariadnamorales commented 2 months ago

looks like it is indeed downloading data:

preston cat --remote https://linker.bio hash://md5/26f7ce5dd404e33c6570edd4ba250d20 | grep "items[?]" | grep hasVersion | preston cat --remote https://linker.bio | jq -c .[] | head -n1 | jq .

zsh: no matches found: .[]
[https://linker.bio/hash:...ae2408f289ef11b3d803994] 552 kB at 1.61 MB/s completed in < 1 minute
[https://linker.bio/hash:...411e22d75d3eb10fff3e318] 451 kB at 1.38 MB/s completed in < 1 minute
[https://linker.bio/hash:...a5576d75db8b8bb7b06fda4] 451 kB at 1.27 MB/s completed in < 1 minute
[https://linker.bio/hash:...8d880ac029afad7597660f9] 453 kB at 1.28 MB/s completed in < 1 minute
[https://linker.bio/hash:...b6933a856b8d42a25657106] 477 kB at 1.38 MB/s completed in < 1 minute
[https://linker.bio/hash:...9474d1ef43b7e2329d06929] 437 kB at 1.55 MB/s completed in < 1 minute
[https://linker.bio/hash:...1a9af4ce7d03b6dd6432062] 474 kB at 1.24 MB/s completed in < 1 minute
[https://linker.bio/hash:...d42bed83e3748eb393d13e7] 463 kB at 1.28 MB/s completed in < 1 minute
[https://linker.bio/hash:...a4796f02ccaed7243d03bd2] 484 kB at 1.73 MB/s completed in < 1 minute
[https://linker.bio/hash:...f08fb8e31483a03dd86fbfa] 472 kB at 1.55 MB/s completed in < 1 minute
[https://linker.bio/hash:...9bcb54b509b4dffc66b5282] 474 kB at 1.20 MB/s completed in < 1 minute
[https://linker.bio/hash:...dc998a01d76a7bd067c054b] 466 kB at 1.50 MB/s completed in < 1 minute
[https://linker.bio/hash:...7ce2c03bb9e0dc375b378e1] 461 kB at 1.41 MB/s completed in < 1 minute
[https://linker.bio/hash:...09ed8ebd6e54b37ce3174e5] 459 kB at 1.50 MB/s completed in < 1 minute
^C
amorales@Ariadnas-MacBook-Pro-2 batLit_rev % ls
data tmp
amorales@Ariadnas-MacBook-Pro-2 batLit_rev % tree data
data
├── 00
│   └── e4
│       └── 00e40ec6aae2408f289ef11b3d803994
├── 1e
│   └── ea
│       └── 1eeae4a50b6933a856b8d42a25657106
├── 26
│   └── f7
│       └── 26f7ce5dd404e33c6570edd4ba250d20
├── 28
│   └── 92
│       └── 28928b5499474d1ef43b7e2329d06929
├── 44
│   └── 92
│       └── 44920a3afd42bed83e3748eb393d13e7
├── 4f
│   └── 18
│       └── 4f18e85729bcb54b509b4dffc66b5282
├── 62
│   └── 5f
│       └── 625f55ea0a5576d75db8b8bb7b06fda4
├── 65
│   └── 11
│       └── 651198dd2dc998a01d76a7bd067c054b
├── a5
│   └── 14
│       └── a514225d9f08fb8e31483a03dd86fbfa
├── c6
│   └── 5d
│       └── c65d7b49f411e22d75d3eb10fff3e318
├── dd
│   └── c1
│       └── ddc14113109ed8ebd6e54b37ce3174e5
├── e0
│   └── 82
│       └── e082fef387ce2c03bb9e0dc375b378e1
├── ee
│   └── 39
│       └── ee39ddd1e8d880ac029afad7597660f9
├── f4
│   └── 8d
│       └── f48dfa1921a9af4ce7d03b6dd6432062
└── f9
    └── f2
        └── f9f279316a4796f02ccaed7243d03bd2

31 directories, 15 files

jhpoelen commented 2 months ago

Looks like you need to single-quote the jq filter: zsh treats an unquoted .[] as a filename pattern, which is where "zsh: no matches found: .[]" comes from. So use jq '.[]' as in:

preston cat --remote https://linker.bio/ hash://md5/26f7ce5dd404e33c6570edd4ba250d20\
 | grep "items[?]"\
 | grep hasVersion\
 | preston cat --remote https://linker.bio/\
 | jq -c '.[]'\
 | head -n1\
 | jq .

Apologies for these tweaks . . . curious to hear whether that works better for you now.

ariadnamorales commented 2 months ago

OK, sorry, I was just following the discussion above. I went to the README and ran:

preston cat --remote https://linker.bio hash://md5/26f7ce5dd404e33c6570edd4ba250d20 | \
    grep "items[?]" | \
    grep hasVersion | \
    preston cat --remote https://linker.bio | \
    jq -c '.[]' | \
    head -n1 | \
    jq .

And it seems to be running without a problem:

{
  "key": "NYT92CCF",
  "version": 48777,
  "library": {
    "type": "group",
    "id": 5435545,
    "name": "Bat Literature Project",
    "links": {
      "alternate": {
        "href": "https://www.zotero.org/groups/bat_literature_project",
        "type": "text/html"
      }
    }
  },
  "links": {
    "self": {
      "href": "https://api.zotero.org/groups/5435545/items/NYT92CCF",
      "type": "application/json"
    },
    "alternate": {
      "href": "https://www.zotero.org/groups/bat_literature_project/items/NYT92CCF",
      "type": "text/html"
    },
    "attachment": {
      "href": "https://api.zotero.org/groups/5435545/items/3MVRRMR8",
      "type": "application/json",
      "attachmentType": "application/pdf",
      "attachmentSize": 388576
    }
  },
  "meta": {
    "createdByUser": {
      "id": 13229919,
      "username": "acsherman",
      "name": "",
      "links": {
        "alternate": {
          "href": "https://www.zotero.org/acsherman",
          "type": "text/html"
        }
      }
    },
    "creatorSummary": "Thong et al.",
    "parsedDate": "2010-10-14",
    "numChildren": 1
  },
  "data": {
    "key": "NYT92CCF",
    "version": 48777,
    "itemType": "journalArticle",
    "title": "Further records of Murina tiensa from Vietnam with first information on its echolocation calls.",
    "creators": [
      {
        "creatorType": "author",
        "firstName": "Vu Dinh",
        "lastName": "Thong"
      },
      {
        "creatorType": "author",
        "firstName": "Christian",
        "lastName": "Dietz"
      },
      {
        "creatorType": "author",
        "firstName": "Annette",
        "lastName": "Denzinger"
      },
      {
        "creatorType": "author",
        "firstName": "Paul J. J.",
        "lastName": "Bates"
      },
      {
        "creatorType": "author",
        "firstName": "Neil M.",
        "lastName": "Furey"
      },
      {
        "creatorType": "author",
        "firstName": "Gabor",
        "lastName": "Csorba"
      },
      {
        "creatorType": "author",
        "firstName": "Glenn",
        "lastName": "Hoye"
      },
      {
        "creatorType": "author",
        "firstName": "Le Dinh",
        "lastName": "Thuy"
      },
      {
        "creatorType": "author",
        "firstName": "Hans-Ulrich",
        "lastName": "Schnitzler"
      }
    ],
    "abstractNote": "The fairy tube-nosed bat, Murina tiensa, is considered to be endemic to Vietnam. It is known only from the original description, when it was found at two localities in limestone karst areas. In 2008, we conducted a series of intensive field surveys throughout the country and obtained additional records of this species from various habitats, including degraded to nearly pristine forests and an offshore island. Our results indicate that M. tiensa is a sexually dimorphic species, females being considerably larger than males in all external and craniodental measurements. The species emits broadband, downward frequency-modulated echolocation calls with a dominant first harmonic. When handheld or when flying in a flight tent, signals had a similar structure and were emitted in groups of 2–4 signals. On average, signals swept from 150 to 49 kHz in 2.2 ms for handheld bats, and, from 145 to 50 kHz in 1.9 ms for flying bats. M. tiensa often occurred in sympatry with M. cyclotis and several rhinolophids.",
    "publicationTitle": "Hystrix, the Italian Journal of Mammalogy",
    "volume": "22",
    "issue": "1",
    "pages": "",
    "date": "October 14, 2010",
    "series": "",
    "seriesTitle": "",
    "seriesText": "",
    "journalAbbreviation": "",
    "language": "en",
    "DOI": "10.4404/hystrix-22.1-4533",
    "ISSN": "18255272, 03941914",
    "shortTitle": "",
    "url": "https://doi.org/10.4404/hystrix-22.1-4533",
    "accessDate": "2024-06-28T00:31:16Z",
    "archive": "",
    "archiveLocation": "",
    "libraryCatalog": "DOI.org (CSL JSON)",
    "callNumber": "",
    "rights": "",
    "extra": "",
    "tags": [],
    "collections": [
      "UAWY6DNP"
    ],
    "relations": {
      "dc:replaces": "http://zotero.org/groups/5435545/items/NF6R8YCX"
    },
    "dateAdded": "2024-07-08T02:34:45Z",
    "dateModified": "2024-08-16T13:50:15Z"
  }
}
[https://linker.bio/hash:...75dc340b2c951ef0fb231e0] 423 kB at 1.60 MB/s completed in < 1 minute
[https://linker.bio/hash:...18489b9946479b5bcb86f63] 465 kB at 1.35 MB/s completed in < 1 minute
[https://linker.bio/hash:...75d8d648bae36cdf037f886] 465 kB at 1.21 MB/s completed in < 1 minute
[https://linker.bio/hash:...66badeb3b2544aedb1f97e4] 417 kB at 1.49 MB/s completed in < 1 minute
[https://linker.bio/hash:...e33ffb6e243468680fd15f0] 425 kB at 1.79 MB/s completed in < 1 minute
[https://linker.bio/hash:...a06d86af6137a40f9c4ed4a] 469 kB at 1.63 MB/s completed in < 1 minute
[https://linker.bio/hash:...b13ca488fb57c2ee6c64f2e] 344 kB at 0.93 MB/s completed in < 1 minute
[https://linker.bio/hash:...911fc99665ff10ce5927a63] 309 kB at 1.15 MB/s completed in < 1 minute
[https://linker.bio/hash:...1799efa016045d440569491] 280 kB at 1.01 MB/s completed in < 1 minute
[https://linker.bio/hash:...a85298cde80837e72c1b641] 264 kB at 0.94 MB/s completed in < 1 minute
[https://linker.bio/hash:...0be3eb5d9c6a6e65c892f55] 355 kB at 1.10 MB/s completed in < 1 minute
[https://linker.bio/hash:...21f145ea2db051715b0dba3] 301 kB at 1.08 MB/s completed in < 1 minute
[https://linker.bio/hash:...f58c786d36940e6ccd1266a] 337 kB at 1.18 MB/s completed in < 1 minute
[https://linker.bio/hash:...eb5a3afc46f7533400f8639] 387 kB at 1.09 MB/s completed in < 1 minute
[https://linker.bio/hash:...5a3085b64b172d89bad83e0] 247 kB at 1.20 MB/s completed in < 1 minute
[https://linker.bio/hash:...e5a06e372d07a7c102158e3] 341 kB at 1.06 MB/s completed in < 1 minute
[https://linker.bio/hash:...f71c391e934f08c7cdd8fef] 305 kB at 1.12 MB/s completed in < 1 minute
[https://linker.bio/hash:...55ebd215f386eae7ab04082] 297 kB at 1.08 MB/s completed in < 1 minute
[https://linker.bio/hash:...c7383824a47b2446d1b5293] 310 kB at 1.12 MB/s completed in < 1 minute
[https://linker.bio/hash:...c3df654b915ff7f47c9a7e1] 309 kB at 1.14 MB/s completed in < 1 minute
[https://linker.bio/hash:...d715cbabb402592361616e4] 308 kB at 1.12 MB/s completed in < 1 minute
[https://linker.bio/hash:...c35845e21bb654820c7a415] 328 kB at 1.18 MB/s completed in < 1 minute
[https://linker.bio/hash:...07635e3eb644e89608a98b9] 333 kB at 1.52 MB/s completed in < 1 minute
[https://linker.bio/hash:...320d5dec0c9e987bfc57517] 294 kB at 1.12 MB/s completed in < 1 minute
[https://linker.bio/hash:...9679c42989e0d1c0aa49a5b] 283 kB at 1.03 MB/s completed in < 1 minute
[https://linker.bio/hash:...14a5787b9e834b637007f2e] 325 kB at 1.14 MB/s completed in < 1 minute
[https://linker.bio/hash:...36ae8bf631159f235d2e6f8] 321 kB at 1.17 MB/s completed in < 1 minute
[https://linker.bio/hash:...84d98a54c1690aa75f8fe4a] 271 kB at 0.96 MB/s completed in < 1 minute
[https://linker.bio/hash:...5517ed500196533da046bc0] 278 kB at 1.08 MB/s completed in < 1 minute
[https://linker.bio/hash:...e0bbc6bf9e9d45a7993ed3c] 312 kB at 1.15 MB/s completed in < 1 minute
[https://linker.bio/hash:...c31b024d03d4d971ae6a294] 338 kB at 1.19 MB/s completed in < 1 minute
[https://linker.bio/hash:...c0aeed571bfa1629419f4a6] 296 kB at 1.09 MB/s completed in < 1 minute
[https://linker.bio/hash:...e44006e29a524da7f2ac780] 287 kB at 1.05 MB/s completed in < 1 minute
[https://linker.bio/hash:...183bb1a0527e0a7418ed88b] 309 kB at 1.28 MB/s completed in < 1 minute
[https://linker.bio/hash:...8b50a4c99e3131d7a12ffb2] 288 kB at 0.94 MB/s completed in < 1 minute
[https://linker.bio/hash:...3142fd7301b7df3787d987f] 338 kB at 1.25 MB/s completed in < 1 minute
[https://linker.bio/hash:...9a7e1f74ba30b331acda2b7] 319 kB at 1.19 MB/s completed in < 1 minute
[https://linker.bio/hash:...4b67e315096efb3514f915c] 308 kB at 1.00 MB/s completed in < 1 minute
[https://linker.bio/hash:...e4febd1540c7a80f442de1f] 277 kB at 0.98 MB/s completed in < 1 minute
[https://linker.bio/hash:...10ead25896302479943a563] 277 kB at 1.08 MB/s completed in < 1 minute
[https://linker.bio/hash:...6b598329a344226c718e7d3] 270 kB at 0.95 MB/s completed in < 1 minute
[https://linker.bio/hash:...cb801def8cc43a38f911e6e] 287 kB at 1.05 MB/s completed in < 1 minute
[https://linker.bio/hash:...47956715168591969976177] 259 kB at 0.92 MB/s completed in < 1 minute
[https://linker.bio/hash:...ccab4c249f83391f433b9c5] 309 kB at 1.09 MB/s completed in < 1 minute
[https://linker.bio/hash:...620d7629fef023d31cfbae1] 318 kB at 1.13 MB/s completed in < 1 minute
[https://linker.bio/hash:...d2da567c661b10c05f1f81b] 318 kB at 1.56 MB/s completed in < 1 minute
[https://linker.bio/hash:...b48743a774635dbbb6cab95] 332 kB at 1.19 MB/s completed in < 1 minute
[https://linker.bio/hash:...e4fa0efdaf1fcb769e2279b] 308 kB at 1.12 MB/s completed in < 1 minute
[https://linker.bio/hash:...784c46acc3bdde6b0f35262] 310 kB at 1.10 MB/s completed in < 1 minute
[https://linker.bio/hash:...f49bed32cfd9461ba7eb777] 332 kB at 1.06 MB/s completed in < 1 minute
[https://linker.bio/hash:...94bf54395d19aa9425fc415] 311 kB at 1.20 MB/s completed in < 1 minute
[https://linker.bio/hash:...75d5fa592bc746bc52ae651] 328 kB at 1.18 MB/s completed in < 1 minute
[https://linker.bio/hash:...fdac46fe243be9bc89aed2a] 337 kB at 1.19 MB/s completed in < 1 minute
[https://linker.bio/hash:...76640277f965d3f78b473c4] 252 kB at 0.93 MB/s completed in < 1 minute
[https://linker.bio/hash:...339b9c686a95d444d2fc42a] 269 kB at 1.00 MB/s completed in < 1 minute
[https://linker.bio/hash:...0c0012063109c5981b8a7c4] 277 kB at 1.05 MB/s completed in < 1 minute
[https://linker.bio/hash:...099a856bd856636d383b4b2] 285 kB at 1.07 MB/s completed in < 1 minute
[https://linker.bio/hash:...67e9f3b6797b75c32465b95] 280 kB at 1.22 MB/s completed in < 1 minute
[https://linker.bio/hash:...133136cbef7f44920afb748] 272 kB at 0.99 MB/s completed in < 1 minute
[https://linker.bio/hash:...bf6bc18b7a930bdccd3f846] 329 kB at 1.17 MB/s completed in < 1 minute
[https://linker.bio/hash:...05b46ededda6b70b6e97763] 313 kB at 1.13 MB/s completed in < 1 minute
[https://linker.bio/hash:...cb3d26a5bb8c884058c2e4c] 289 kB at 1.02 MB/s completed in < 1 minute
[https://linker.bio/hash:...902a15493cac81a3d331a52] 259 kB at 0.95 MB/s completed in < 1 minute
[https://linker.bio/hash:...3b5a6419abeed59bf9e4291] 251 kB at 1.05 MB/s completed in < 1 minute
[https://linker.bio/hash:...c992a33fc19a43a35d6afae] 254 kB at 0.93 MB/s completed in < 1 minute
[https://linker.bio/hash:...b24476b938b21f3725edf8c] 290 kB at 1.03 MB/s completed in < 1 minute
[https://linker.bio/hash:...cab085d11bc0c6e9afeab7a] 267 kB at 0.98 MB/s completed in < 1 minute
[https://linker.bio/hash:...79bb87f9bc3eae385d895ca] 301 kB at 1.10 MB/s completed in < 1 minute
[https://linker.bio/hash:...bed066d0824c95f47db5ac6] 288 kB at 1.01 MB/s completed in < 1 minute
[https://linker.bio/hash:...60937273b64b7c8b1dd2c9c] 305 kB at 1.23 MB/s completed in < 1 minute
[https://linker.bio/hash:...cdb9f9ce397ddebb9a5ee6e] 269 kB at 1.29 MB/s completed in < 1 minute
[https://linker.bio/hash:...dccdee1ac35949120b98f08] 281 kB at 1.12 MB/s completed in < 1 minute
[https://linker.bio/hash:...83a8b95c14c82ae9883f23d] 271 kB at 0.90 MB/s completed in < 1 minute
[https://linker.bio/hash:...d7698a052bfc6ba01bbb68f] 295 kB at 0.98 MB/s completed in < 1 minute

jhpoelen commented 2 months ago

@ariadnamorales I am glad you were able to reproduce the example that was included in https://batlit.org . And thank you for taking the time.

What would have made it easier for your past self to get to this point? Any suggestions on making it easier to run the examples?

jhpoelen commented 2 months ago

(note that after running the example successfully, you should be able to run the same example with the internet turned off).

ariadnamorales commented 2 months ago

Well, I had not installed preston and jq, so I had to install them first; new users might face the same issue. Having a "prerequisites" section would help. Other than that, the example code is very clear. However, the output folders are a bit cryptic. I am not sure how to interpret them...

ariadnamorales commented 2 months ago

Also, would it be possible to save the output of jq (JSON format) as a table? That would be easier for users to interpret. The README mentions a table, but I cannot find any tsv or csv files among the files that were downloaded.

jhpoelen commented 2 months ago

Well, I had not installed preston and jq, so I had to install them first; new users might face the same issue. Having a "prerequisites" section would help.

I've added a prerequisite section for your review at https://batlit.org#prerequisites .

Other than that, the example code is very clear. However the output folders are a bit cryptic. Not sure how to interpret them...

The output folders are not meant to be read by humans. Instead, they are more like the hidden .git folders that come with cloned copies of Git repositories.
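As a rough illustration of that layout, here is a minimal sketch based only on the tree listing posted earlier in this thread, where each file sits under data/, then the first two hex characters of its hash, then the next two, then the full hash. The helper name hash_to_path is made up for this example and is not part of preston; other preston versions or settings may lay files out differently.

```shell
# Hypothetical helper (not a preston command): map a hash URI to the place
# where the tree listing in this thread shows the local copy.
# Assumed layout: data/<hex chars 1-2>/<hex chars 3-4>/<full hex hash>
hash_to_path() {
  hex="${1##*/}"                                # drop the "hash://md5/" prefix
  first=$(printf '%s' "$hex" | cut -c1-2)       # e.g. "26"
  second=$(printf '%s' "$hex" | cut -c3-4)      # e.g. "f7"
  printf 'data/%s/%s/%s\n' "$first" "$second" "$hex"
}

hash_to_path hash://md5/26f7ce5dd404e33c6570edd4ba250d20
# prints data/26/f7/26f7ce5dd404e33c6570edd4ba250d20
```

This is only meant to make the folder names less mysterious; for actual access, preston cat remains the intended route.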

The preston tool is meant to be used as a way to discover the content in that data archive using commands like preston ls, preston cat, and preston history. Because this repository is expressed in md5 space, you'd have to add --algo md5 to those commands (sha256 is the default, but md5 is friendlier for Zenodo content).
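To make "md5 space" a bit more concrete, here is a small sketch that builds a hash://md5/ identifier from whatever bytes arrive on standard input. It assumes the identifier is simply the md5 hex digest of the content, and it assumes a GNU-style md5sum is installed (on macOS, md5 -q may be a substitute). The function name hash_uri is hypothetical and is not a preston command.

```shell
# Hypothetical helper: print the hash://md5/ identifier for content on stdin.
# Assumption: the identifier is the md5 hex digest of the raw bytes.
hash_uri() {
  # md5sum prints "<hex digest>  -" for stdin; keep only the digest
  printf 'hash://md5/%s\n' "$(md5sum | cut -d' ' -f1)"
}

# Empty input gives the well-known md5 of zero bytes:
printf '' | hash_uri
# prints hash://md5/d41d8cd98f00b204e9800998ecf8427e

# If the assumption holds, a file in the local data/ folder should hash
# to its own name, for example:
#   hash_uri < data/26/f7/26f7ce5dd404e33c6570edd4ba250d20
```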

You can also run preston as a (local) server using:

preston s --algo md5

to make the content accessible via http (e.g., an internet browser) on port 8080. This is what is powering https://linker.bio and enables stuff like:

preston clone\
 --algo md5\
 --anchor hash://md5/26f7ce5dd404e33c6570edd4ba250d20\
 http://localhost:8080/

open to any suggestions. . .

jhpoelen commented 2 months ago

Also, would it be possible to save the output of jq (JSON format) as a table? That would be easier for users to interpret. The README mentions a table, but I cannot find any tsv or csv files among the files that were downloaded.

jq supports tsv / csv output as documented at https://jqlang.github.io/jq/manual/#format-strings-and-escaping .

And, I often use miller https://miller.readthedocs.io/en/6.12.0/ to do (streaming) table processing and conversions. Using miller, I created the tables included in the README. Would you like me to link a downloadable tsv/csv version of the markdown table in the text?
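The jq route mentioned above can be sketched as follows. This is a minimal example, not the method used to build the README tables (those were made with miller). It assumes the input is a JSON array of Zotero items shaped like the record shown earlier in this thread, and the choice of columns (key, DOI, title) is arbitrary. The function name items_to_tsv is made up for this example.

```shell
# Hypothetical helper: turn a JSON array of Zotero items (on stdin) into
# tab-separated rows using jq's @tsv format string.
# Items without a DOI or title get an empty cell, since @tsv renders null
# as an empty string.
items_to_tsv() {
  jq -r '.[] | [.data.key, .data.DOI, .data.title] | @tsv'
}

# Possible use with the pipeline from the README (needs network access on
# the first run, so it is left commented out here):
#   preston cat --remote https://linker.bio hash://md5/26f7ce5dd404e33c6570edd4ba250d20 \
#     | grep "items[?]" \
#     | grep hasVersion \
#     | preston cat --remote https://linker.bio \
#     | items_to_tsv > items.tsv
```

Swapping @tsv for @csv should give comma-separated output instead; the file name items.tsv is just a placeholder.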

jhpoelen commented 2 months ago

btw - if you'd like, feel free to edit the README.md as you see fit and submit a pull request. If you don't like pull requests, let me know and we'll figure something else out.

Thanks for being patient in reviewing v0.5 of batlit.