StoXProject / RstoxData

R library for reading various biotic and acoustic data formats
https://stoxproject.github.io/RstoxData/
GNU Lesser General Public License v3.0

unclear warning StoxBiotic #267

Open edvinf opened 1 year ago

edvinf commented 1 year ago

When converting to StoxBiotic, a warning is issued if there are several serialnumbers for a station in NMDbiotic:

"There are more than one 'serialnumber' (HaulKey in StoxBioticData) for 113 out of 243 'station'(StationKey in StoxBioticData) in the NMDBiotic data. In DefineBioticAssignment() it is currently only possible to asssing all hauls of a station in the map (manual assignment). If certain Hauls should be exclcuded, use FilterStoxBiotic(). Duplicated serialnumber for the following cruise/station (of the fishstation table of the BioticData): <...list of cruise / station ids>"

The last line in the warning refers to duplicated serialnumbers, which makes it sound like serialnumbers are not unique in the source data. Should probably say "Several serialnumbers for the following ..."

Secondly, the warning message refers to cruise/station as if both were fields in the fishstation table of BioticData. That is true for 'station', but 'cruise' is a field in the 'mission' table in NMDbiotic.

There is also a typo: 'asssing' --> 'assign'?

BergenCalling commented 1 year ago

Will these do? There are no processes between the two.

biotic_year_2022_species_164744.xml_agedetermination.txt
Individual.txt

edvinf commented 1 year ago

The warning message also, oddly, lists cruise/station identifiers for the stations where this applies, but the 'cruise' part does not seem to identify a cruise in either Biotic or StoxBiotic. For commercial fisheries data in particular, the 'cruise' variable is typically NA. First, this makes it difficult for users to trace which stations are actually problematic. Second, it may identify the wrong stations altogether, since station numbers are only supposed to be unique within a given mission in NMDbiotic.
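To make the misidentification concrete, here is a minimal sketch in plain Python (not RstoxData code). The record layout, the dictionary keys and the mission labels "A" and "B" are assumptions made purely for illustration; in particular, a real mission would be identified by its own key fields in NMDbiotic rather than a single label. It shows how grouping on cruise/station can flag a station when 'cruise' is NA, while grouping on mission/station does not.

```python
from collections import defaultdict

# Hypothetical fishstation records, already joined with their parent mission.
# 'cruise' is None (NA), as is typical for commercial fisheries data.
records = [
    {"mission": "A", "cruise": None, "station": 1, "serialnumber": 100},
    {"mission": "A", "cruise": None, "station": 2, "serialnumber": 101},
    {"mission": "B", "cruise": None, "station": 1, "serialnumber": 200},
]


def stations_with_several_serialnumbers(rows, key_fields):
    """Return the station keys that have more than one distinct serialnumber."""
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        groups[key].add(row["serialnumber"])
    # Sort on repr so that keys containing None can still be ordered.
    return sorted((k for k, s in groups.items() if len(s) > 1), key=repr)


# Keyed on cruise/station: station 1 of missions A and B collapse into one key.
by_cruise = stations_with_several_serialnumbers(records, ("cruise", "station"))

# Keyed on mission/station: every station has exactly one serialnumber.
by_mission = stations_with_several_serialnumbers(records, ("mission", "station"))
```

In this toy data, `by_cruise` contains the key `(None, 1)` even though no single station has two serialnumbers, and `by_mission` is empty.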

edvinf commented 1 year ago

I also think the reference to DefineBioticAssignment can be dropped here, since incorrect use of station numbers also creates problems for many other kinds of downstream analysis. It should be easy to check in DefineBioticAssignment whether it is processing data with several hauls for any station, and to issue a warning there.
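As a rough idea of what such a downstream check could look like, here is a minimal sketch in plain Python (not the RstoxData implementation). The function name `warn_if_several_hauls`, the pair-based input layout and the wording of the message are all hypothetical; only the names StationKey and HaulKey come from the warning quoted above.

```python
import warnings
from collections import Counter


def warn_if_several_hauls(hauls):
    """Warn when any StationKey has more than one distinct HaulKey.

    `hauls` is an iterable of (StationKey, HaulKey) pairs (hypothetical layout).
    Returns the sorted list of affected StationKeys.
    """
    distinct_pairs = set(hauls)  # ignore repeated rows for the same haul
    hauls_per_station = Counter(station for station, _ in distinct_pairs)
    affected = sorted(s for s, n in hauls_per_station.items() if n > 1)
    if affected:
        warnings.warn(
            f"Several hauls for {len(affected)} out of "
            f"{len(hauls_per_station)} stations: "
            + ", ".join(map(str, affected))
        )
    return affected


# Station S1 has two hauls, station S2 has one.
affected = warn_if_several_hauls([("S1", "H1"), ("S1", "H2"), ("S2", "H3")])
```

Placing the check in the step that actually depends on one haul per station keeps the conversion warning generic, which is the separation suggested above.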