NEFSC / READ-SSB-Lee-KA_Scallop

DLR_ORPHAN_TRIP and PZERO #6

Open kmaspelund opened 4 months ago

kmaspelund commented 4 months ago

Hi Min-Yang,

I wanted to be sure I understand two orphan issues (separate from the VTR orphans issue you've highlighted).

Below I focus only on scallop observations in CAMS (itis_tsn == 79718).

Here are the statuses of observations that do not merge between the subtrips and land datasets. As you said, these are observations for boats without scallop permits. I did the merge exactly as you described.

[image: table of status values for the unmerged scallop observations]

They are mostly made up of

This is 85,176 observations, or about 15% of all scallop observations, merged and unmerged combined (there are 465,606 merged scallop observations). I wanted to ask because it did not seem like a negligible share.

So am I right that the interpretation is the following: the scallop observations that do not merge from the sub-trip data are ones where permit IDs are not recorded (and therefore they cannot be merged back to trips) or the dealer flagged a scallop purchase that cannot be connected back to a trip. This is not some story of a bunch of mystery scallop trips by non-permitted boats.

To that end, if you can point me to something more about the PZERO status, that would be great!

Thanks so much! Really grateful for all the work you put in up front in getting this prepared.

Karl

mle2718 commented 4 months ago

@kmaspelund

  1. These datasets are changing relatively quickly as bugs are found and fixed. For example, camsid 114716_20121103113000_4075676 looks different in the "live" data today compared to the data that I just extracted for you. It is a little frustrating, but the data quality will continue to get better. It is still useful to get a flavor of the data and explore the main points, but don't get hung up on tiny details (for example, how do we get _merge==2 and status=="MATCH"?!?! That shouldn't be possible). Plus, the code you write will still work; you'll just pass in a new dataset.

We are supposed to get data refreshes overnight on Thursdays, in time for Friday AM at the start of business. It should be relatively easy for me to re-run my code and send over updated data.

The easy answer is that these are observations where the vessel did not have a federal permit. The states of Maine and Massachusetts have a decent state-waters fishery for scallops, Maine more so than Massachusetts. You are not modeling the behavior of these vessels, so you can safely drop all records with status=="PZERO". You could retain them if you need to construct a dataset of prices. Note the relatively low landings for these rows.

Data earlier in the time series will also have a little more slop and less QA/QC. I think, but I am not certain, that there was a little more state-waters fishing earlier (say, pre-2010).

/* I have my data stored in the folder $data_main */
global data_vintage 2024_03_21
use "$data_main/cams_subtrip_${data_vintage}.dta", clear
/* one subtrip row can match many rows in the land dataset */
merge 1:m camsid subtrip using "$data_main/cams_land_${data_vintage}"
/* keep scallops only */
keep if itis_tsn==79718

/* reproduce Karl's table: rows that appear only in the land dataset */
keep if _merge==2
tab status

/* look at the PZERO rows: where and when they occur, and how much they land */
browse if status=="PZERO"
tab state if status=="PZERO"
tab year state if status=="PZERO"
summ lndlb if status=="PZERO", detail
centile lndlb if status=="PZERO", centile(1 2 5 10 25 50 90 95 99)
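
The drop-or-retain choice for PZERO rows described above could be handled roughly as follows. This is only a sketch: it assumes the merge is re-run from scratch (the block above has already restricted the data to _merge==2), and the indicator name pzero and the output file name are hypothetical.

/* Sketch only: re-run the merge so that both merged and unmerged rows are in memory */
use "$data_main/cams_subtrip_${data_vintage}.dta", clear
merge 1:m camsid subtrip using "$data_main/cams_land_${data_vintage}"
keep if itis_tsn==79718

/* flag the no-federal-permit rows instead of dropping them right away */
gen byte pzero = (status=="PZERO")
tab pzero _merge

/* save a copy without PZERO rows; the full data stay in memory for price work */
preserve
    drop if pzero
    save "$data_main/scallop_nopzero_${data_vintage}.dta", replace   /* hypothetical file name */
restore

Flagging first and dropping inside preserve/restore keeps the PZERO rows available in case they are needed later for a price dataset.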

> So am I right that the interpretation is the following: the scallop observations that do not merge from the sub-trip data are ones where permit IDs are not recorded (and therefore they cannot be merged back to trips) or the dealer flagged a scallop purchase that cannot be connected back to a trip. This is not some story of a bunch of mystery scallop trips by non-permitted boats.

Well, they aren't federally permitted. But they are a mystery, in the sense that I don't have any trip-level info on them.

> dealer flagged a scallop purchase that cannot be connected back to a trip

I am unsure what you mean by this. Is this with respect to the DLR_ORPHAN_TRIP rows?

mle2718 commented 4 months ago

> for example, how do we get _merge==2 and status=="MATCH"?!?! That shouldn't be possible

This is happening because I have pulled all of the commercial landings data, but trip data only for the subset of commercial vessels that have ever held a federal scallop permit from 2003 to present.

Some of the _merge==2 rows are happening because the vessel:

  1. had no federal permit (PZERO), or
  2. didn't have a federal scallop permit (most of the status=="MATCH" rows).

The rest of the reasons are in the CAMS documentation.
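
A quick way to see how the unmerged rows split across these reasons is to cross-tabulate status against _merge. This is a sketch only, assuming the merged scallop data are in memory before any restriction to _merge==2; it uses only variables that already appear in the code above.

/* Sketch only: run on the merged scallop data, before "keep if _merge==2" */
tab status _merge, missing

/* rows found only in the land dataset that still carry status MATCH */
count if _merge==2 & status=="MATCH"

/* rows found only in the land dataset with no federal permit */
count if _merge==2 & status=="PZERO"
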

kmaspelund commented 4 months ago

SUPER helpful Min-Yang. Thanks! Yes, documentation is very helpful too. More to come...