I am using bold::bold_specimens within a custom function to return specimen records for a given taxon in "tsv" format. I am fairly sure this worked in the past, but the resulting data.frame is now malformed. The header line partly wraps into the first row of the data.frame, producing a bogus row 1 and spurious 'NA' entries in other rows.
Example:
specimen_table <- bold::bold_specimens(taxon="Anaspididae", format="tsv")
specimen_table
processid sampleid recordID catalognum fieldnum
1 image_ids image_urls copyright_licenses trace_ids trace_links
2 GBCM0002-06 AF048821 468923
3
4 GBCM0381-06 DQ310660 501348 HBLB 047 (BIO)
5
6 RBGC001-03 MaAna000 4901 MaAna000
7
institution_storing bin_uri phylum_taxID phylum_name class_taxID
1 run_dates sequencing_centers directions seq_primers marker_codes
2 Mined from GenBank, NCBI BOLD:AAF3961 20 Arthropoda 69
3
4 Mined from GenBank, NCBI BOLD:AAF3962 20 Arthropoda 69
5
6 Biodiversity Institute of Ontario BOLD:AAF3961 20 Arthropoda 69
7
class_name order_taxID order_name family_taxID family_name subfamily_taxID subfamily_name
1 NA NA NA NA
2 Malacostraca 352 Anaspidacea 1697 Anaspididae NA NA
3 NA NA NA NA
4 Malacostraca 352 Anaspidacea 1697 Anaspididae NA NA
5 NA NA NA NA
6 Malacostraca 352 Anaspidacea 1697 Anaspididae NA NA
7 NA NA NA NA
genus_taxID genus_name species_taxID species_name subspecies_taxID subspecies_name
1 NA NA NA NA
2 5694 Anaspides 8241 Anaspides tasmaniae NA NA
3 NA NA NA NA
4 5694 Anaspides 8241 Anaspides tasmaniae NA NA
5 NA NA NA NA
6 5694 Anaspides 8241 Anaspides tasmaniae NA NA
7 NA NA NA NA
identification_provided_by voucher_type tissue_type collectors collectiondate lifestage sex
1 NA NA NA NA NA NA NA
2 NA NA NA NA NA NA NA
3 NA NA NA NA NA NA NA
4 NA NA NA NA NA NA NA
5 NA NA NA NA NA NA NA
6 NA NA NA NA NA NA NA
7 NA NA NA NA NA NA NA
reproduction extrainfo notes lat lon coord_source coord_accuracy country province
1 NA NA NA NA NA NA NA NA
2 NA NA NA NA NA NA NA NA
3 NA NA NA NA NA NA NA NA
4 NA NA NA NA NA NA NA NA
5 NA NA NA NA NA NA NA NA
6 NA Anaspides tasmaniae NA NA NA NA NA NA NA
7 NA NA NA NA NA NA NA NA
region exactsite X
1 NA NA NA
2 NA NA NA
3 NA NA NA
4 NA NA NA
5 NA NA NA
6 NA NA NA
7 NA NA NA
str(specimen_table)
'data.frame': 7 obs. of 42 variables:
$ processid : chr "image_ids" "GBCM0002-06" "" "GBCM0381-06" ...
$ sampleid : chr "image_urls" "AF048821" "" "DQ310660" ...
$ recordID : chr "copyright_licenses" "468923" "" "501348" ...
$ catalognum : chr "trace_ids" " " "" "HBLB 047 (BIO)" ...
$ fieldnum : chr "trace_links" " " "" " " ...
$ institution_storing : chr "run_dates" "Mined from GenBank, NCBI" "" "Mined from GenBank, NCBI" ...
$ bin_uri : chr "sequencing_centers" "BOLD:AAF3961" "" "BOLD:AAF3962" ...
$ phylum_taxID : chr "directions" "20" "" "20" ...
$ phylum_name : chr "seq_primers" "Arthropoda" "" "Arthropoda" ...
$ class_taxID : chr "marker_codes" "69" "" "69" ...
$ class_name : chr "" "Malacostraca" "" "Malacostraca" ...
$ order_taxID : int NA 352 NA 352 NA 352 NA
$ order_name : chr "" "Anaspidacea" "" "Anaspidacea" ...
$ family_taxID : int NA 1697 NA 1697 NA 1697 NA
$ family_name : chr "" "Anaspididae" "" "Anaspididae" ...
$ subfamily_taxID : logi NA NA NA NA NA NA ...
$ subfamily_name : logi NA NA NA NA NA NA ...
$ genus_taxID : int NA 5694 NA 5694 NA 5694 NA
$ genus_name : chr "" "Anaspides" "" "Anaspides" ...
$ species_taxID : int NA 8241 NA 8241 NA 8241 NA
$ species_name : chr "" "Anaspides tasmaniae" "" "Anaspides tasmaniae" ...
$ subspecies_taxID : logi NA NA NA NA NA NA ...
$ subspecies_name : logi NA NA NA NA NA NA ...
$ identification_provided_by: logi NA NA NA NA NA NA ...
$ voucher_type : logi NA NA NA NA NA NA ...
$ tissue_type : logi NA NA NA NA NA NA ...
$ collectors : logi NA NA NA NA NA NA ...
$ collectiondate : logi NA NA NA NA NA NA ...
$ lifestage : logi NA NA NA NA NA NA ...
$ sex : logi NA NA NA NA NA NA ...
$ reproduction : logi NA NA NA NA NA NA ...
$ extrainfo : chr "" " " "" " " ...
$ notes : logi NA NA NA NA NA NA ...
$ lat : logi NA NA NA NA NA NA ...
$ lon : logi NA NA NA NA NA NA ...
$ coord_source : logi NA NA NA NA NA NA ...
$ coord_accuracy : logi NA NA NA NA NA NA ...
$ country : logi NA NA NA NA NA NA ...
$ province : logi NA NA NA NA NA NA ...
$ region : logi NA NA NA NA NA NA ...
$ exactsite : logi NA NA NA NA NA NA ...
$ X : logi NA NA NA NA NA NA ...
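As a stopgap while the cause is unknown, the malformed rows can be filtered out after the call. The following is only a sketch on a hypothetical miniature of the data.frame above: specimen_demo, spilled_header and drop_malformed_rows are names made up for illustration and are not part of the bold package. It assumes that every genuine record has a non-empty processid and that the spilled header fields are the ones visible in row 1 of the output. It hides the symptom only; it does not recover values that ended up as NA in the later columns.

```r
# Hypothetical miniature of the malformed data.frame shown above.
specimen_demo <- data.frame(
  processid   = c("image_ids", "GBCM0002-06", "", "GBCM0381-06", ""),
  sampleid    = c("image_urls", "AF048821", "", "DQ310660", ""),
  order_taxID = c(NA, 352L, NA, 352L, NA),
  stringsAsFactors = FALSE
)

# Header fields that spilled into row 1 (copied from the printed output).
spilled_header <- c("image_ids", "image_urls", "copyright_licenses",
                    "trace_ids", "trace_links")

# Keep only rows whose processid is non-empty and is not a spilled header name.
drop_malformed_rows <- function(df, header_names) {
  id <- trimws(df$processid)
  keep <- !is.na(id) & nzchar(id) & !(id %in% header_names)
  df[keep, , drop = FALSE]
}

cleaned <- drop_malformed_rows(specimen_demo, spilled_header)
```

On the real result the call would be drop_malformed_rows(specimen_table, spilled_header). The original row names are kept, so it is easy to see which rows were discarded.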
Any help would be very welcome! Thanks in advance.
Session info here: