The popler folks did not get all of their datasets from the repository. Many (most) were downloaded from individual sites' websites and did not include the repository packageId. These are labeled "NA" in the popler_knbid.csv file.
However, a packageId exists for every dataset I've looked up manually (approx. 10). Those are already in the list called L0_metacommunities.
Possible ways to fill in the "NA"s:
A. Continue manually (eew).
B. Scrape the URL and look for more info, e.g., a DOI or packageId that was missed.
C. Query titles in PASTA.
Will start with option C: many sites now use the same title on their own website as in PASTA, even if they do not display a PASTA packageId.
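A rough sketch of how option C might work, in Python. Everything named here is an assumption for illustration: the `title` and `packageId` column names, the helper functions, the example ids in the usage note, and in particular the search endpoint and Solr-style query syntax, which should be checked against the PASTA documentation before use. The idea is to normalize titles so small punctuation or spacing differences do not block a match, and to fill an "NA" only when exactly one candidate packageId matches.

```python
import re
from urllib.parse import urlencode

# Assumed search endpoint; verify against the PASTA documentation.
PASTA_SEARCH = "https://pasta.lternet.edu/package/search/eml"


def normalize_title(title):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", title.lower()).split())


def search_url(title):
    """Build a title-query URL (the query parameters are an assumption)."""
    params = {
        "defType": "edismax",
        "q": f'title:"{title}"',
        "fl": "packageid,title,doi",
    }
    return f"{PASTA_SEARCH}?{urlencode(params)}"


def fill_missing_ids(rows, candidates):
    """Fill 'NA' packageIds from (packageId, title) candidates.

    rows: list of dicts with hypothetical keys 'title' and 'packageId'.
    candidates: list of (packageId, title) pairs, e.g. parsed search results.
    Only unambiguous matches on the normalized title are filled; anything
    with zero or several matching ids stays 'NA' for manual review.
    """
    by_title = {}
    for pkg_id, title in candidates:
        by_title.setdefault(normalize_title(title), set()).add(pkg_id)

    filled = []
    for row in rows:
        row = dict(row)  # copy so the input rows are not modified
        if row["packageId"] == "NA":
            hits = by_title.get(normalize_title(row["title"]), set())
            if len(hits) == 1:
                row["packageId"] = next(iter(hits))
        filled.append(row)
    return filled
```

For example, a row titled "Kelp Forest  Fish: Abundance" with packageId "NA" would pick up a made-up id such as "knb-lter-xxx.1.1" from a candidate titled "kelp forest fish abundance". A title that matches two different ids is left as "NA", which matters here because several revisions of one package can share a title.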
Popler (Aldo) is aware of this shortcoming in their process, and may come up with a way to gather DOIs instead of the URLs they currently use to link out to metadata (as URLs are already breaking).