Closed bthe closed 3 years ago
I'm still tempted by the idea of a stock variable being computed somehow, with syntax along the lines of stock = mfdb_combine(list(species = 'cod', age = mfdb_group(juv = 1:3, adult = 4:20)), list(species = 'had', age = mfdb_group(juv = 1:2, adult = 3:20))). Making that work will be interesting, but more to the point it probably doesn't help you in this case, since I'm guessing that even if it's related to fjord, splitting up by area and trying to derive stock from area isn't the right thing to do.
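To make the derived-stock idea concrete, here is a rough base R sketch of what such a computed column could mean. It does not use mfdb at all: the samples data frame, the stock_rules list and the derive_stock helper are all made up for illustration, and only the species/age groupings come from the example above.

```r
# Hypothetical sample data; column names are illustrative, not mfdb's schema.
samples <- data.frame(
  species = c("cod", "cod", "had", "had", "cod"),
  age     = c(2, 7, 2, 3, 25),
  stringsAsFactors = FALSE
)

# One rule set per species: each group name maps to the ages it covers.
stock_rules <- list(
  cod = list(juv = 1:3, adult = 4:20),
  had = list(juv = 1:2, adult = 3:20)
)

# Hypothetical helper: find the group whose age range contains the
# sample's age; NA when the species or age is not covered by any rule.
derive_stock <- function(species, age, rules) {
  groups <- rules[[species]]
  if (is.null(groups)) return(NA_character_)
  for (group_name in names(groups)) {
    if (age %in% groups[[group_name]]) {
      return(paste(species, group_name, sep = "_"))
    }
  }
  NA_character_
}

# The derived column is computed from existing columns, never stored.
samples$stock <- mapply(derive_stock, samples$species, samples$age,
                        MoreArgs = list(rules = stock_rules),
                        USE.NAMES = FALSE)
```

The point of the sketch is only that such a "stock" is a function of other columns, which is why a stored column with the same name would be confusing.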
But I think it's worth avoiding making a "stock" column so you don't confuse it with a derived column as above. Would a "label" column make sense? Or can you think of a better name?
The rest seems fairly straightforward.
> But I think it's worth avoiding making a "stock" column so you don't confuse it with a derived column as above. Would a "label" column make sense? Or can you think of a better name?
Yes, I take your point, especially with the abuse of the term in gadget :D
We probably could get away with naming the column population instead.
> More tow attributes
The intended home for these seems to have been gear: "mesh_size" is already there, and there's a TODO comment about basically adding what you're suggesting. The join happens at the sample level, i.e. for an individual sample we have a vessel/tow/gear reference, rather than saying that for each tow we use a gear.
Any objections to putting them there instead? I guess if the gear varies constantly between tows then it will get a bit boring to populate, but not the end of the world.
> add a trip to the tow taxonomy
Would you not potentially have multiple trips to a tow (or 0, because it's not really a relevant concept to the gear used)? A separate trip table feels more appropriate here. But I don't think there's anything contentious about that.
> adding a parent_id which links to the company id within the same table.
There's a 2 layer hierarchy mechanism that's designed for cases like this (I think institution was the original use-case), so being smart is very easy :)
I'm wary about having a vessel -> owner structure, since presumably vessels will change owners. But OTOH, I guess from your perspective you don't necessarily know that vessel A (owned by company X) and vessel B (owned by company Y) are actually the same hunk of metal---or even if you did, it's not particularly interesting from your point of view.
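For reference, the parent_id idea quoted above amounts to a self-referencing lookup table. A minimal base R sketch, assuming a made-up owners table (the ids, names and the parent_name column are purely illustrative and say nothing about how mfdb's hierarchy mechanism is implemented):

```r
# Hypothetical owner lookup: parent_id points back at id in the same table.
owners <- data.frame(
  id        = c(1, 2, 3),
  name      = c("Parent Co", "Daughter A", "Daughter B"),
  parent_id = c(NA, 1, 1),
  stringsAsFactors = FALSE
)

# Resolve each owner's parent by matching parent_id against id.
# Top-level owners (parent_id = NA) get NA as their parent name.
owners$parent_name <- owners$name[match(owners$parent_id, owners$id)]

# All daughters of a given parent company.
daughters_of_1 <- owners$name[!is.na(owners$parent_id) & owners$parent_id == 1]
```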
> Any objections to putting them there instead? I guess if the gear varies constantly between tows then it will get a bit boring to populate, but not the end of the world
No, not really. You would need to do a whole lot more with the gear taxonomy though, as the number of hooks will vary a lot more than the mesh size between "tows". But I would always want to be able to pull out just the long-lines (i.e. gear = 'LLN') and simply ignore the number of hooks, but also be able to split the tows into groups based on mesh size (for gillnets) or number of hooks.
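Roughly what I mean, as a base R sketch on a made-up tows table. The column names (hooks, mesh_size), the 'BMT' row and the break points are assumptions for illustration only, not the mfdb schema or query interface:

```r
# Hypothetical tow table: gear type is separate from per-tow attributes.
tows <- data.frame(
  tow       = c("T1", "T2", "T3", "T4", "T5"),
  gear      = c("LLN", "LLN", "GIL", "GIL", "BMT"),
  hooks     = c(500, 12000, NA, NA, NA),
  mesh_size = c(NA, NA, 5, 9, NA),
  stringsAsFactors = FALSE
)

# All long-line tows, ignoring the number of hooks entirely.
longline_tows <- tows[tows$gear == "LLN", ]

# Gillnet tows split into groups by mesh size; the break point of 6
# is arbitrary and only there to show the grouping.
gillnets <- tows[tows$gear == "GIL", ]
gillnets$mesh_group <- cut(gillnets$mesh_size,
                           breaks = c(0, 6, Inf),
                           labels = c("small_mesh", "large_mesh"))
```

Both queries work off the same rows, which is not possible if every hook count or mesh size needs its own gear name.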
> A separate trip table feels more appropriate here.
I was thinking along those lines: you would have a trip that links together several tows. You can also have multiple gears per trip.
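As a sketch of that layout in base R, assuming each tow belongs to exactly one trip (the trips and tows tables and all their column names are invented for illustration):

```r
# Hypothetical trip table, one row per trip.
trips <- data.frame(
  trip   = c("TRIP1", "TRIP2"),
  vessel = c("V1", "V2"),
  stringsAsFactors = FALSE
)

# Hypothetical tow table; each tow carries a reference to its trip.
tows <- data.frame(
  tow  = c("T1", "T2", "T3"),
  trip = c("TRIP1", "TRIP1", "TRIP2"),
  gear = c("LLN", "GIL", "LLN"),
  stringsAsFactors = FALSE
)

# Attach trip-level information to every tow.
tows_with_trip <- merge(tows, trips, by = "trip")

# Number of distinct gears used on each trip.
gears_per_trip <- tapply(tows$gear, tows$trip, function(g) length(unique(g)))
```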
> I'm wary about having a vessel -> owner structure, since presumably vessels will change owners.
The Icelandic vessel registry stores the history of ownership and assigns a history number whenever there is a change in vessel properties (e.g. change of ownership). When I import this into mfdb I append the history id to the vessel id so this sorts itself out. This works for now but it is of course a bit messy and you lose track of the vessels. I am open to other suggestions.
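In base R terms the workaround looks something like the sketch below. The registry data frame, its column names and the "." separator are assumptions for illustration, not the actual registry layout or my import script:

```r
# Hypothetical registry extract: one row per vessel per history record.
registry <- data.frame(
  vessel_id  = c(1001, 1001, 2002),
  history_id = c(1, 2, 1),
  owner      = c("Company X", "Company Y", "Company Z"),
  stringsAsFactors = FALSE
)

# Combined identifier: unique per vessel *and* ownership period.
registry$mfdb_vessel <- paste(registry$vessel_id, registry$history_id, sep = ".")

# The underlying vessel can only be recovered by stripping the suffix
# again, which is the messy part.
registry$hull <- sub("\\.[0-9]+$", "", registry$mfdb_vessel)
```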
I've added a "vessel_owner" taxonomy and tables, but I'm marginally worried about the overlap between this and "institute". They're nearly the same thing, but I couldn't think of a pithy term to encompass both, and wasn't sure it's worth the renaming upheaval anyway.
I must admit that I have not used the institute lookup much, but I'll make an effort this year :)
But to add to this wish list it would be useful to have depth associated with an areacell, as you will have tow depths which are not bottom trawls.
I think part of the problem with the gear vs tow is that we've been slightly too flexible with defining gear, e.g.:
> mfdb::gear[grepl('GIL', mfdb::gear$name), c('name', 'description')]
       name        description
73      GIL           gillnets
74  GIL.GI1  gillnet, mesh 1''
75  GIL.GI2  gillnet, mesh 2''
76  GIL.GI3  gillnet, mesh 3''
77  GIL.GI4  gillnet, mesh 4''
78  GIL.GI5  gillnet, mesh 5''
79  GIL.GI6  gillnet, mesh 6''
80  GIL.GI7  gillnet, mesh 7''
81  GIL.GI8  gillnet, mesh 8''
82  GIL.GI9  gillnet, mesh 9''
83 GIL.GI10 gillnet, mesh 10''
84 GIL.GI11 gillnet, mesh 11''
85 GIL.GI12 gillnet, mesh 12''
Selecting a range of GIL.GI(x) is obviously annoying (even if they weren't in inches). And like you say, assigning a name to every possible combination of gear attributes is a bit silly. So I think adding to tow is correct. Ideally the gear type would be another attribute of the tow table, and GIL.GI(x) wouldn't exist as an option.
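To illustrate the difference, here is a small base R sketch comparing the two approaches. It rebuilds the gear names by hand instead of reading mfdb::gear, and the tows data frame with a numeric mesh_size column is an assumption about what the tow-level attribute might look like:

```r
# Gear names as in the listing above, rebuilt by hand for the example.
gear_names <- c("GIL", paste0("GIL.GI", 1:12))

# Mesh size encoded in the name: it has to be parsed out before a
# range such as 4 to 7 can be selected.
has_mesh <- grepl("^GIL\\.GI[0-9]+$", gear_names)
mesh_from_name <- rep(NA_integer_, length(gear_names))
mesh_from_name[has_mesh] <- as.integer(sub("^GIL\\.GI", "", gear_names[has_mesh]))
wanted_by_name <- gear_names[!is.na(mesh_from_name) &
                             mesh_from_name >= 4 & mesh_from_name <= 7]

# Mesh size as a numeric attribute (hypothetical tow table): the same
# selection is an ordinary range comparison.
tows <- data.frame(gear = "GIL", mesh_size = c(2, 5, 7, 11))
wanted_by_attribute <- tows[tows$mesh_size >= 4 & tows$mesh_size <= 7, ]
```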
But what do you do if you have no tow information, but gear information? I don't like having 2 sets of columns, one in gear and one in tow, and you having to know whether to use "tow_mesh_size" or "gear_mesh_size". But having done several mental laps around the problem, I don't think there's a better solution.
I don't have a better solution, and this seems to be a good compromise. If you forced people to add tows to the data you would probably cause a lot of grief. But on the other hand, if you want to go into further detail (mesh size, number of hooks etc.) I would expect you would want to record more with the tow than just the gear type.
Right, my action is finally working. Have a look at: https://github.com/gadget-framework/mfdb/blob/wip-7.x/.github/workflows/gh-pages.yml
mf database (and gelda, but obviously that's not interesting for anything beyond the examples).
This should be all you need to get something that uses MFDB going. As for the output of this action, unfortunately the proxy doesn't let all the assets download, but you can have a peek here:
Once wip-7.x is merged it'll look much nicer.
Right, the tow demo, already updated for the extra fields, has now been converted into a vignette:
...which should show how to use the new fields (although you could have probably guessed), and will be a lot more accessible than the old demo scripts. It still functions as a test, although I cheated and have hidden test comparison chunks after each query.
There are a couple of things missing from mfdb that would expand its potential use. As @lentinj suggested earlier this week it is best to get started with listing these items. Starting with additional data types:
[x] Stock, e.g. here in Iceland we have multiple shrimp stocks (each fjord is a stock), so identifying data (samples, landings etc.) by stock would be useful
[x] More biological attributes such as liver_weight, gonad_weight, stomach_weight, gutted_weight which can be linked with consumption calculations and maturity in gadget.
[x] More tow attributes: atm the tow taxonomy table is geared towards bottom trawls. We need to account for long lines with number_of_hooks and bait_type, and for gillnets the number_of_nets, net_type and mesh_size would be useful. The bait and net types would be linked to a lookup table (name and description).
[x] I also think that some work towards the economic side would be very useful. At least I think it would be worthwhile to add a trip to the tow taxonomy, where the trip id would point to a table where the following columns are stored:
[x] For vessels, a link to the owner which would refer to a lookup table where at least the id and name of the owner are recorded. If we want to be really smart we would allow for links between owners (parent -> daughter) by adding a parent_id which links to the company id within the same table.
[x] Areacell depth
[x] Translate demo scripts into vignettes (and leave the tests as hidden parts)
[x] Add schemaspy generated schema docs into general docs build