ropensci / rfishbase

R interface to the fishbase.org database
https://docs.ropensci.org/rfishbase

fb_tbl(spawnagg)? #284

Open dannymooseLFC opened 1 month ago

dannymooseLFC commented 1 month ago

The fb_tbl("spawnagg") table has exactly the information I need for a list of grouper species. The table appears in the list of tables, and fb_tbl("spawnagg") gives me the example table. However, when I use it in code alongside my species list, R says that spawnagg doesn't exist as a function (only spawning appears as an option). Am I missing something here?

Any help appreciated :)

cboettig commented 1 month ago

@dannymooseLFC sorry but I don't follow your question. Perhaps you could provide a reproducible example of the code you are running?

I really don't know what you mean by "an example" table. Calling fb_tbl() always returns all available data in that table, not "an example". Perhaps you mean that the tibble display format only prints the first 10 rows and then says that it has 963 more rows?

For instance, I see this:

fb_tbl("spawnagg")
# A tibble: 973 × 16
   SpawnAggID C_Code SpecCode SynCode SpawnAggRef SpawningType AggregationType DirectSpawning                       IndirectSpawning CurrentStatus LunarPhase SpawningMonths HabitatType Management Gear  Reference
        <dbl> <chr>     <int>   <int>       <int> <chr>        <chr>           <chr>                                <chr>            <chr>         <chr>      <chr>          <chr>       <chr>      <chr> <chr>    
 1          1 044          18   22722          NA Unknown      Unknown         ""                                   High seasonal l… Unknown       ""         ""             Other       Spawning … Unkn… Carleton…
 2          2 840         318    1918          NA Unknown      Unknown         "Spawning observation"               High seasonal l… Gone          ""         ""             Unspecified Time/area… Mid-… Bailey, …
 3          3 192         154   22781          NA Unknown      Unknown         "Unspecified"                        High seasonal l… Decreasing    "Full moo… "January, Feb… Outer reef… Unknown    Unkn… Claro, R…
 4          4 583        4460   25130          NA Unknown      Transient       "Unspecified"                        High seasonal l… Unknown       "Full moo… "March, "      Outer reef… Spawning … Unkn… Rhodes K…
 5          5 583        6473   26142          NA Unknown      Transient       "Hydrated eggs"                      High seasonal l… Unknown       ""         "May, June, J… Outer reef… Unknown    Hook… Rhodes K…
 6          6 584       60479    5683          NA Unknown      Resident        "Spawning observation"               Courtship; Colo… Unknown       ""         "November, "   Reef chann… Unknown    Unkn… Colin P.…
 7          7 796          18   22722          NA Unknown      Transient       "Unspecified"                        Colour changes … Unknown       "Full moo… "February, "   Reef promo… Marine Pr… Trap  Mark Tup…
 8          8 084        1216   23652          NA Unknown      Unknown         "Unspecified"                        Colour changes … Unknown       "First qu… "January, "    Reef chann… Time/area… Hook… Salas E,…
 9          9 084          12   22703          NA Unknown      Unknown         "Spawning observation; Hydrated egg… Courtship; High… Unknown       "First qu… ""             Reef chann… Spawning … Unkn… Salas E,…
10         10 192          18   22722          NA Unknown      Unknown         "Unspecified"                        High seasonal l… Unknown       ""         ""             Other       Unknown    Unkn… Sadovy Y…
# ℹ 963 more rows
# ℹ Use `print(n = ...)` to see more rows
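To confirm that nothing is missing, check the row count or ask print() for more rows. A minimal sketch, using a stand-in tibble (`demo` is hypothetical; with rfishbase you would use fb_tbl("spawnagg") in its place):

```r
library(dplyr)  # re-exports tibble()

# Hypothetical stand-in for fb_tbl("spawnagg"): 973 rows, one column
demo <- tibble(SpawnAggID = 1:973)

# Full row count, regardless of how many rows the display shows
n_rows <- nrow(demo)

# Show the first 15 rows instead of the default 10;
# print(demo, n = Inf) would show every row
print(demo, n = 15)
```

The truncated display is only a printing choice; the object itself holds every row.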

dannymooseLFC commented 4 weeks ago

@cboettig Apologies, I would say I'm still a beginner with R. I will try to explain my issue a bit more clearly.

First of all, I set a value using the code Grouper <- species_list(Family = "Epinephelidae"), which returns a chr vector of 170 Latin species names. Then I want to populate the spawnagg table with info on these species, using the code Species_Spawnagg <- spawnagg(Grouper).

With this I get an error, as it says spawnagg does not exist as a function. I try the same code with "spawning" for example, and it works as intended.

Does that make more sense? Thanks in advance :)

cboettig commented 4 weeks ago

@dannymooseLFC thanks for clarifying, yes I follow now, though generally the best way to explain what you are doing is to literally paste the precise code you ran, what we call a reprex.

I recommend a workflow like this:

library(rfishbase)
library(dplyr)

## Get the whole spawning and spawn agg table, joined together:
spawn <- left_join(fb_tbl("spawning"),  fb_tbl("spawnagg"), relationship = "many-to-many")

## Get the "taxa" table, a helper fn that combines Species tables with Family, Order, and Class tables.
taxa  <- load_taxa()             

# Filter taxa down to the desired species
groupers <- taxa |> filter(Family == "Epinephelidae")

## Inner join: keeps only the rows whose species appear in groupers
spawn |> inner_join(groupers)

Lemme know if the logic there is confusing. (Note the relationship = "many-to-many" is confusing to us all! It tells dplyr that a SpecCode may appear multiple times in both tables, so each row in one table can match several rows in the other. Without it, dplyr warns when it detects such matches.)
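A toy illustration of what a many-to-many join does to row counts may help. This is only a sketch: the two small tables and their Season and Site columns are made up, and it assumes dplyr 1.1.0 or later, where the relationship argument exists.

```r
library(dplyr)

# Toy tables (hypothetical data): SpecCode 18 appears twice in each
spawning_toy <- tibble(SpecCode = c(18, 18, 318),
                       Season   = c("spring", "summer", "winter"))
spawnagg_toy <- tibble(SpecCode = c(18, 18),
                       Site     = c("reef A", "reef B"))

# Each of the two left rows for 18 matches both right rows (2 x 2 = 4 rows),
# plus one unmatched row for 318, which gets NA in Site.
joined <- left_join(spawning_toy, spawnagg_toy,
                    by = "SpecCode",
                    relationship = "many-to-many")
nrow(joined)  # 5
```

Naming the key with by = "SpecCode" is a choice made for this sketch; it makes explicit which column the tables are matched on.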

You tried the "old" syntax, which would look like:


Grouper <- species_list(Family = "Epinephelidae")
Species_Spawning <- spawning(Grouper)
Species_Spawnagg <- spawnagg(Grouper) # ERROR! 
left_join(Species_Spawning, Species_Spawnagg, relationship = "many-to-many")

This doesn't ever use fb_tbl()! Like the error said, there is no function called spawnagg. ls("package:rfishbase") will list all the functions. fb_tables() will list all the fishbase tables. NOTE there are more tables than functions! I couldn't keep up writing custom functions for each table, so fb_tbl() was introduced as a simple/direct way to access the tables.

Note that both workflows are about the same length, roughly four lines each. The new table-based approach uses more dplyr, and is more easily adapted. The main thing to know here is how to use load_taxa() to navigate between the SpecCode found in most tables and the various taxonomic names.

dannymooseLFC commented 4 weeks ago

@cboettig Thank you very much for this, and noted re reprex. |> is a piece of code I've not come across before, is it for filtering?

The new syntax looks clearer so thank you for introducing me to it.

Based on your penultimate paragraph, it appears that spawnagg is a table but not currently a function, is that correct? Meaning there isn't currently a way to produce a list of species with corresponding columns for "SpawningType", "AggregationType", "LunarPhase", "SpawningMonths" etc.?

I can appreciate the huge amount of work that goes into these packages, so I only ask so it's clear for me, and not to judge!

Many thanks

cboettig commented 4 weeks ago

|> is the "pipe" operator in R (it is essentially the same as %>%, if you have seen that). It passes the value on its left as the first argument of the function on its right: x |> some_fn(y) is just another way of writing some_fn(x, y). See https://www.tidyverse.org/blog/2023/04/base-vs-magrittr-pipe/ for details.
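A small base-R sketch of that equivalence (it assumes R 4.1 or later, where |> is available; the numbers are arbitrary):

```r
x <- c(1, 4, 9)

# Nested call: read from the inside out
a <- sum(sqrt(x))

# Piped call: read left to right; each result becomes the first argument
b <- x |> sqrt() |> sum()

identical(a, b)  # TRUE

# With an extra argument: the same as round(3.14159, 2)
r <- 3.14159 |> round(2)
```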

> Based on your penultimate paragraph, it appears that spawnagg is a table but not currently a function, is that correct?

Yes, but please note that there is no need for functions like spawning() any more. fb_tbl("spawning") gives you access to all the spawning data! fb_tbl("spawnagg") gives you access to all of the spawnagg data. It already gives you columns for "SpawningType", "AggregationType", "LunarPhase", "SpawningMonths".

fb_tbl("spawnagg")
# A tibble: 973 × 16
   SpawnAggID C_Code SpecCode SynCode SpawnAggRef SpawningType AggregationType DirectSpawning              
        <dbl> <chr>     <int>   <int>       <int> <chr>        <chr>           <chr>                       
 1          1 044          18   22722          NA Unknown      Unknown         ""                          
 2          2 840         318    1918          NA Unknown      Unknown         "Spawning observation"      
 3          3 192         154   22781          NA Unknown      Unknown         "Unspecified"               
 4          4 583        4460   25130          NA Unknown      Transient       "Unspecified"               
 5          5 583        6473   26142          NA Unknown      Transient       "Hydrated eggs"             
 6          6 584       60479    5683          NA Unknown      Resident        "Spawning observation"      
 7          7 796          18   22722          NA Unknown      Transient       "Unspecified"               
 8          8 084        1216   23652          NA Unknown      Unknown         "Unspecified"               
 9          9 084          12   22703          NA Unknown      Unknown         "Spawning observation; Hydr…
10         10 192          18   22722          NA Unknown      Unknown         "Unspecified"               
# ℹ 963 more rows
# ℹ 8 more variables: IndirectSpawning <chr>, CurrentStatus <chr>, LunarPhase <chr>, SpawningMonths <chr>,
#   HabitatType <chr>, Management <chr>, Gear <chr>, Reference <chr>
# ℹ Use `print(n = ...)` to see more rows

Note that all the columns you mentioned are included, for all species. The code I showed above will combine this table with the spawning table and drop the species that are not in the grouper family. Remember, everything is just a table now. You can filter it by whatever you want. You can join it to another table. Does this make sense?
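As a sketch of "filter it by whatever you want", here is a toy stand-in built from four rows of the printouts in this thread (`agg` and `known` are hypothetical names; with rfishbase you would start from fb_tbl("spawnagg") instead):

```r
library(dplyr)

# Toy stand-in: rows 2, 3, 4 and 6 of the printed table, four columns only
agg <- tibble(
  SpecCode        = c(318, 154, 4460, 60479),
  SpawningType    = c("Unknown", "Unknown", "Unknown", "Unknown"),
  AggregationType = c("Unknown", "Unknown", "Transient", "Resident"),
  CurrentStatus   = c("Gone", "Decreasing", "Unknown", "Unknown")
)

# Keep rows with a known aggregation type, then only the columns of interest
known <- agg |>
  filter(AggregationType != "Unknown") |>
  select(SpecCode, AggregationType, CurrentStatus)
```

select() is also a convenient way to bring columns of interest to the front when a wide table hides them in the printed output.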

Thank you for your questions here! It helps me understand where we need to improve the README at the very least!

dannymooseLFC commented 2 weeks ago

@cboettig Thank you :) I understand it, I just can't see the "spawnagg" columns come up in the resulting table for grouper when I follow the updated code! I will keep trying and fiddling with it. Hopefully it works.

Edit: If I only want the data from the spawnagg table and not the spawning table, what would the code look like for that?

Thank you for answering my questions! Dan :)

cboettig commented 2 weeks ago

To get only the spawnagg table:

fb_tbl("spawnagg")

(likewise for any other table, e.g. fb_tbl("spawning") or fb_tbl("species"))
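To restrict a single table to one family, the same join idea from earlier in the thread applies. The sketch below is not part of rfishbase: `filter_by_family` is a hypothetical helper, the toy data are made up so that it runs offline, and it assumes the taxa table carries SpecCode, Species and Family columns.

```r
library(dplyr)

# Hypothetical helper (not part of rfishbase): keep only the rows of a
# table whose SpecCode belongs to the given family.
filter_by_family <- function(tbl, taxa, family) {
  keep <- taxa |>
    filter(Family == family) |>
    select(SpecCode, Species)
  inner_join(tbl, keep, by = "SpecCode")
}

# Toy data so the sketch runs offline; species codes and names are made up
taxa_toy <- tibble(SpecCode = c(1, 2, 3),
                   Species  = c("Fish a", "Fish b", "Fish c"),
                   Family   = c("Epinephelidae", "Epinephelidae", "Lutjanidae"))
agg_toy  <- tibble(SpecCode = c(1, 1, 3),
                   AggregationType = c("Transient", "Resident", "Unknown"))

grouper_agg <- filter_by_family(agg_toy, taxa_toy, "Epinephelidae")

# With rfishbase loaded and network access, the same call would be:
# filter_by_family(fb_tbl("spawnagg"), load_taxa(), "Epinephelidae")
```

Note that species with no records in the table drop out of an inner join (species 2 in the toy data). To keep every species in the family, with NA where nothing is recorded, start from the taxa side and use left_join() instead.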