rBatt / trawl

Analysis of scientific trawl surveys of bottom-dwelling marine organisms

In NEUS, is "wtcpue" really just biomass? Effort? #17

Closed rBatt closed 9 years ago

rBatt commented 9 years ago

For the NEUS data, we are basically supplied a .RData file w/ data.tables. This data.table has columns for BIOMASS and ABUNDANCE. But these never appear to be divided by any measure of effort, nor can I find an appropriate measure of effort. These columns are not referenced in the metadata file supplied by Malin.

The scripts I use to read in data are very closely based on Malin's scripts. In my scripts, the data are read in via an .RData file. This can be seen on line 34. On line 76, you can see that I sum over the abundance and biomass to ignore differences between sexes. On line 102 I rename the biomass/abundance columns as wtcpue/cntcpue, but I never divide by any measure of effort.
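
To make the concern concrete, here is a minimal sketch of the two steps described above (sum over sex, then rename). It is written in Python rather than the repo's R, and the row layout, the `tow`/`spp`/`sex` keys, and the function name `sum_over_sex` are all hypothetical. Only the BIOMASS/ABUNDANCE and wtcpue/cntcpue column names come from this issue.

```python
from collections import defaultdict

def sum_over_sex(rows):
    """Sum BIOMASS and ABUNDANCE across sexes within each (tow, species)."""
    totals = defaultdict(lambda: {"BIOMASS": 0.0, "ABUNDANCE": 0})
    for row in rows:
        key = (row["tow"], row["spp"])
        totals[key]["BIOMASS"] += row["BIOMASS"]
        totals[key]["ABUNDANCE"] += row["ABUNDANCE"]
    # The rename below is only a rename: nothing is divided by effort,
    # so "wtcpue" and "cntcpue" still hold per-tow totals.
    return [
        {"tow": tow, "spp": spp, "wtcpue": t["BIOMASS"], "cntcpue": t["ABUNDANCE"]}
        for (tow, spp), t in totals.items()
    ]

# Hypothetical example rows (made-up values, not NEUS data).
rows = [
    {"tow": "t1", "spp": "sppA", "sex": "M", "BIOMASS": 1.5, "ABUNDANCE": 3},
    {"tow": "t1", "spp": "sppA", "sex": "F", "BIOMASS": 2.5, "ABUNDANCE": 5},
    {"tow": "t1", "spp": "sppB", "sex": "U", "BIOMASS": 0.5, "ABUNDANCE": 1},
]
result = sum_over_sex(rows)
```

With these made-up rows, the "cntcpue" for sppA is just 3 + 5: a raw count carrying a per-effort name.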

I checked my code against the code Malin gave me, and against the code being used for the website. In both cases, the procedure is almost identical. The important point seems to be that the data we see originate in code by Sean Lucey, who I do not know. In this Sean Lucey script, it appears as though a separate SQL database is queried, catch values are corrected for gear changes, and then the output is saved as .RData. Based on line 9, it seems as though this database is located on some sort of local server or drive (local relative to Sean Lucey's computer); thus, this code would not work for me (the data don't appear to be on the web). Also, I tried running this script and got some "Windows only" errors, and then the first query appeared to be timing out.

@mpinsky might have ideas concerning 1) are these data corrected for effort? 2) can we get effort? 3) can we talk to Sean Lucey?

mpinsky commented 9 years ago

As far as I have been led to understand, the data are corrected for effort. I don't know the units. Sean Lucey's email is sean.lucey@noaa.gov. I've emailed with him in the past and he has been very helpful and responsive. For example, he's agreed to send us the data updates every year.

And yes, he runs the script from behind the NEFSC firewall to extract the data from their ORACLE database. We don't have direct access.

rBatt commented 9 years ago

The biomass might be per effort, but the ABUNDANCE column contains integers ... unlikely to be counts/effort, right?

rBatt commented 9 years ago

Also, in his scripts he refers to the data being "corrected"; I think that's a reference to gear corrections etc., but I should check again.

Moreover, if abundance is "corrected" for any sort of change, it shouldn't appear as an integer.
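
A rough way to test this reasoning, sketched in Python rather than the repo's R: check whether every value in a column is a whole number. The function name `all_whole_numbers` and the example counts are hypothetical. The effort value is the standard swept area quoted later in this thread, used here only for illustration.

```python
def all_whole_numbers(values, tol=1e-9):
    """Return True if every value is within tol of an integer."""
    return all(abs(v - round(v)) <= tol for v in values)

# Hypothetical raw counts per tow (made-up values, not NEUS data).
raw_counts = [3, 12, 0, 7]

# Illustrative effort value: the standard swept area (km^2) quoted later in this thread.
ASSUMED_SWEPT_AREA_KM2 = 0.0384
per_area = [c / ASSUMED_SWEPT_AREA_KM2 for c in raw_counts]

counts_are_whole = all_whole_numbers(raw_counts)   # raw counts stay whole
per_area_are_whole = all_whole_numbers(per_area)   # e.g. 3 / 0.0384 = 78.125
```

A column that passes this check is unlikely to be counts per effort. The check cannot rule out values that were divided and then rounded.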

mpinsky commented 9 years ago

True, unless abundance also gets rounded (though that would seem odd). I do believe that the corrections refer to gear and vessel changes.

rBatt commented 9 years ago

This question was essentially re-asked in #53 (I forgot about this issue), but the comments in that issue are more informative and actually answer it: no, the NEUS data are not corrected for effort, and we need to ask for area swept. So we know the answer and how to fix it (ask for area in the next data request). There might be other regions with a similar issue.

mpinsky commented 9 years ago

Also note the rest of the answer from Sean:

For older data, we tend to assume a standard swept area of 0.0384 km^2. For a good majority of the survey, we did not have net mensuration sensors to calculate the actual tow footprint. Since the Bigelow has come on-line, we do have those sensors and can have a much more accurate estimate of swept area by individual tow. That being said, the data I provided is converted to Albatross numbers so calculating an actual swept area for the Bigelow would be misleading.

rBatt commented 9 years ago

Right – he's implying the area swept is pretty standard, and that it'd be hard for us to make an equivalent area correction across the whole data set. Is that what you're saying? I think for now the numbers are probably pretty good, but until I look into it in detail I'm just assuming that it could still be an issue.

mpinsky commented 9 years ago

Exactly. We can divide all mass numbers by the standard swept area of a trawl to get density estimates, but it doesn't sound like we can get tow-by-tow swept area measurements to make tow-specific corrections.
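
As a minimal sketch of that standard-area correction (in Python rather than the repo's R): divide each per-tow catch by one assumed swept area. The 0.0384 km^2 figure is the one Sean quoted. The function name `to_density` and the example catch value are hypothetical, and the mass unit is whatever the BIOMASS column uses, which this thread does not establish.

```python
# Standard swept area per tow, in km^2, as quoted from Sean in this thread.
STANDARD_SWEPT_AREA_KM2 = 0.0384

def to_density(catch_per_tow, swept_area_km2=STANDARD_SWEPT_AREA_KM2):
    """Convert a per-tow catch to catch per km^2 using one assumed swept area."""
    if swept_area_km2 <= 0:
        raise ValueError("swept area must be positive")
    return catch_per_tow / swept_area_km2

# Hypothetical per-tow biomass (made-up value, units unknown).
density = to_density(3.84)  # roughly 100 mass units per km^2
```

Because every tow is divided by the same constant, this rescales the values without changing relative differences between tows. A tow-specific correction would need per-tow swept areas, which do not seem to be available.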
