BioKIC / NEON-Biorepository

Development base for the NEON Biorepository Data Portal hosted by BioKIC - Arizona State University (https://biorepo.neonscience.org)
GNU General Public License v2.0

Harvester updates for mosquito sample classes #441

Open sunray1 opened 3 months ago

sunray1 commented 3 months ago

For the mosquito sample classes, the harvester is failing to record data from the API, even though the information is available.

For Bulk Identified/DNA Extracts/Pathogen Extracts/Pinned Vouchers:

sunray1 commented 3 months ago

Related to #325

kyule commented 1 month ago

I did some additional digging here 5/28/24

If we were to uncomment the line //if(strpos($tableName,'identification')) continue; (so that the identification table is skipped), we would be able to correctly harvest the collector for pinned mosquitos and DNA samples, because the API always provides "National Ecological Observatory Network..." for these samples (likely because of how the expert identifiers are returning data). However, uncommenting that line prevents harvesting of other fields that do rely on that table. It would be better if we could get NEON to not include that in the API.

The sorting table is needed in order to get the determination dates and references, but relying on info that far up the hierarchy leads to incorrect plot, individual count, etc. Commenting out the line that causes us to skip the barcoding table. Collection date is provided in the barcoding table but not the pinned table, so you need to go up to the sorting table for the pinned individual but not its child sample :/
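
For reference, the table-skip checks discussed above can be sketched roughly as follows. This is a minimal illustration, not the harvester's actual code: the table names and the shouldSkipTable() helper are hypothetical, and only the strpos()-based check comes from the line quoted in this comment. Note that PHP's strpos() returns 0 when the match is at the very start of the string, and 0 is falsy, so the bare if(strpos(...)) form never skips a table whose name begins with the fragment. Whether that matters depends on the real table names.

```php
<?php
// Hypothetical helper; not taken from the harvester source.
function shouldSkipTable(string $tableName, array $skipFragments): bool
{
    foreach ($skipFragments as $fragment) {
        // Compare against false explicitly: strpos() returns 0 (falsy)
        // when the fragment is found at the start of the name.
        if (strpos($tableName, $fragment) !== false) {
            return true;
        }
    }
    return false;
}

// Hypothetical table names, for illustration only.
$tables = ['mos_sorting', 'mos_identification', 'mos_barcoding', 'mos_pinning'];

// Mirrors the state described in this thread: only 'barcoding' is skipped.
$skip = ['barcoding'];

$harvested = [];
foreach ($tables as $tableName) {
    if (shouldSkipTable($tableName, $skip)) {
        continue;
    }
    $harvested[] = $tableName;
}

print_r($harvested);
```

Adding 'identification' to $skip in this sketch corresponds to uncommenting the line quoted above.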

kyule commented 1 month ago

Currently the line to skip the 'identifications' table is commented out, but the line to skip the 'barcoding' table is active. Sample classes like pinned and DNA samples are therefore able to get the correct taxon for the sample but do not see the identifiedBy, identificationResources, and identificationDate fields, because those are located in the barcoding table, etc. They get the taxon and report the determiner, etc., as unknown. Although they would be able to see this information in the parent sample, they already have an identification (albeit incomplete) from the sample itself, so $harvestIdentifications is false and the parent sample identification is not used.
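
One possible direction, sketched below under stated assumptions, would be to fill only the missing identification details from the parent sample rather than treating the sample's own identification as all-or-nothing. The fillFromParent() function, the 'sciname' key, and the array layout are hypothetical and not taken from the harvester; only the three field names come from the description above, and the values are made up.

```php
<?php
// Sketch only: the function, the 'sciname' key, and the array layout are assumptions.
function fillFromParent(array $own, array $parent): array
{
    // Only borrow details when both records name the same taxon.
    if (!isset($own['sciname'], $parent['sciname'])
        || $own['sciname'] !== $parent['sciname']) {
        return $own;
    }
    $fields = ['identifiedBy', 'identificationResources', 'identificationDate'];
    foreach ($fields as $field) {
        // Keep any value the sample already has; fill gaps from the parent.
        if (empty($own[$field]) && !empty($parent[$field])) {
            $own[$field] = $parent[$field];
        }
    }
    return $own;
}

// Made-up example values.
$own = ['sciname' => 'Aedes vexans'];
$parent = [
    'sciname' => 'Aedes vexans',
    'identifiedBy' => 'Example Identifier',
    'identificationDate' => '2020-01-01',
];

$merged = fillFromParent($own, $parent);
print_r($merged);
```

The taxon check is there so that a child sample re-identified to a different taxon does not inherit a determiner or date that belongs to the parent's determination.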