jbrzusto / TO_DO

sensorgnome / motus TODO list for jbrzusto

Hourly plots not showing new data #139

Closed zcrysler closed 6 years ago

zcrysler commented 6 years ago

Two receivers have recently had data uploaded but the hourly plots show no detections, antenna or gps activity. The raw data has files written for every hour/day and I would be surprised if there was nothing to even show antenna activity for months at a time:

Job status 866596: Mosaic Port Maitland SG-5113BBBK2799 project 1, data uploaded for 2017-06-26 to 2017-11-07. Data in the plots only goes to 2017-04-14: https://sensorgnome.org/download/1/SG-5113BBBK2799-2016_Motus_Mosaic%20Port%20Maitland_hourly_tags.png

Job status 884433: Lakeview Landfill SG-3214BBBK5554 project 48 (but getting placed in project 1 because the metadata/project was changed since initial upload), data uploaded for 2017-07-10 to 2017-12-07: https://sensorgnome.org/download/1/SG-3214BBBK5554-2017_Motus_Lakeview%20Landfill_hourly_tags.pdf

jbrzusto commented 6 years ago

SG-5113BBBK2799: something is wrong on that receiver's internal OS image: it is writing files with bootcounts "000NaN" instead of an integer. This is apparently true for all files from 2017-07-10 onward, judging from the uploaded archive. Here's part of the listing from ross.wood:2017-12-08T02-34-13.604:PortMaitlandDec7.7z:

2017-04-10 12:07:48 ....A           79               Port Maitland Dec7/2000-01-01/Mosaic-5113BBBK2799-002135-2000-01-01T00-00-37.7160P-all.txt.gz
2017-04-13 10:52:30 ....A           80               Port Maitland Dec7/2000-01-01/Mosaic-5113BBBK2799-002204-2000-01-01T00-00-37.7790P-all.txt.gz
2017-07-10 01:20:20 ....A         2744               Port Maitland Dec7/2017-07-10/Mosaic-5113BBBK2799-000NaN-2017-07-10T00-20-20.4530T-all.txt.gz
2017-07-10 02:20:20 ....A         2736               Port Maitland Dec7/2017-07-10/Mosaic-5113BBBK2799-000NaN-2017-07-10T01-20-20.5500T-all.txt.gz
2017-07-10 03:20:20 ....A         2597               Port Maitland Dec7/2017-07-10/Mosaic-5113BBBK2799-000NaN-2017-07-10T02-20-20.6530T-all.txt.gz
2017-07-10 04:20:20 ....A        15759               Port Maitland Dec7/2017-07-10/Mosaic-5113BBBK2799-000NaN-2017-07-10T03-20-20.7210T-all.txt.gz

Because these filenames don't parse correctly, they are ignored. (That's what the part of the log message saying "ignoring files for which I can't determine the receiver" means; the message really means it can't parse the filenames.)
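
To illustrate why a "000NaN" bootcount makes a file unparseable, here is a minimal sketch in Python. The pattern is only inferred from the filenames in the listing above (prefix, serial number, six-digit bootcount, timestamp plus a one-letter code, file type); it is an assumption for illustration, not the actual sensorgnome filename parser, and `parse_name` is a hypothetical helper.

```python
import re

# Pattern inferred from the listing above; an assumption, NOT the real parser.
FILENAME_RE = re.compile(
    r"^(?P<prefix>[^-]+)-"                      # e.g. "Mosaic"
    r"(?P<serial>[0-9A-Z]+)-"                   # e.g. "5113BBBK2799"
    r"(?P<bootcount>\d{6})-"                    # must be six digits
    r"(?P<timestamp>\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}\.\d+)"
    r"(?P<tscode>[A-Z])-"                       # one-letter code after the timestamp
    r"(?P<kind>\w+)\.txt(?P<gz>\.gz)?$"         # e.g. "all.txt" or "all.txt.gz"
)

def parse_name(name):
    """Return the filename fields as a dict, or None if the name does not parse."""
    m = FILENAME_RE.match(name)
    return m.groupdict() if m else None
```

With this pattern, a name containing `-002135-` parses, while one containing `-000NaN-` returns `None`, because "000NaN" is not six digits.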

This needs to be fixed manually. I have renamed the files with 000NaN boot counts to have new valid boot counts starting at 002067 (the first bootcount not seen in any files). For these files, after each uncompressed .txt file in the archive, I've bumped up the bootcount for subsequent files, since a reboot (or download) is the usual reason an uncompressed file persists. (Normally, the compressed version supersedes the uncompressed version once a time or size threshold has been reached, e.g. every 1 hour or 1 megabyte.)
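
The renaming rule described above can be sketched roughly as follows. This is a hypothetical Python illustration, not the script that was actually used: `assign_bootcounts` is a made-up name, the input is assumed to be a list of filenames already in chronological order, and treating a `.txt` file and its `.txt.gz` twin as belonging to the same boot session is an added assumption.

```python
def assign_bootcounts(names, first_bootcount):
    """Replace '000NaN' bootcounts with integers, starting at first_bootcount.

    names: filenames in chronological order (assumed).
    The bootcount is bumped for files that follow an uncompressed .txt file.
    """
    boot = first_bootcount
    renamed = []
    prev_stem = None
    bump_pending = False
    for name in names:
        # Name without the .gz suffix, so foo.txt and foo.txt.gz compare equal.
        stem = name[:-3] if name.endswith(".gz") else name
        if bump_pending and stem != prev_stem:
            boot += 1
            bump_pending = False
        renamed.append(name.replace("-000NaN-", f"-{boot:06d}-", 1))
        if name.endswith(".txt"):
            # An uncompressed file persisted: assume a reboot or download followed.
            bump_pending = True
        prev_stem = stem
    return renamed
```

Called with `first_bootcount=2067`, the files up to and including the first uncompressed .txt would get bootcount 002067, and the files after it 002068, and so on.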

I've updated the uploaded archive with corrected filenames, and the new uploaded file is currently running as job 887158.

Also, the receiver itself needs maintenance. I've filed an issue for that as MotusDev/Motus-TO-DO#124


SG-3214BBBK5554: the last data upload only went up to Nov. 7, and that's where the plot ends.

The last part of the listing from the uploaded archive zcrysler:2017-12-13T21-53-13.088:lakeview_nov7.7z is:

2017-11-07 19:47:00 ....A           50               Lakeview Nov7/2017-11-07/lakeview-3214BBBK5554-000028-2017-11-07T19-47-00.6760Y-all.txt.gz
2017-11-07 19:47:00 ....A           49               Lakeview Nov7/2017-11-07/lakeview-3214BBBK5554-000028-2017-11-07T19-47-00.8000X-all.txt.gz
2017-11-07 19:47:01 ....A           49               Lakeview Nov7/2017-11-07/lakeview-3214BBBK5554-000028-2017-11-07T19-47-00.8920W-all.txt.gz
2017-11-07 19:48:51 ....A        16208               Lakeview Nov7/2017-11-07/lakeview-3214BBBK5554-000028-2017-11-07T19-47-01.0170V-all.txt
2017-11-07 19:47:01 ....A           10               Lakeview Nov7/2017-11-07/lakeview-3214BBBK5554-000028-2017-11-07T19-47-01.0170V-all.txt.gz
------------------- ----- ------------ ------------  ------------------------
2017-11-07 19:54:18          858323430    857954303  4550 files, 133 folders

zcrysler commented 6 years ago

For receiver SG-3214BBBK5554, the last upload goes to Nov. 7th, but the plot shows nothing between July and November, and the files I have look like data was recorded for all those months. Was there really nothing there, not even false positives for antenna readings?

jbrzusto commented 6 years ago

The run for boot session 25 ran into a tag finder bug. I've requeued a run for that session to see whether that makes any difference. That was in job 785756. Once a file has been processed, even if it generates an error, re-uploading it does nothing. The new status interface will have a way to deal with errors in jobs.

j-sayers commented 4 years ago

I'm just looking at the data from this receiver and preparing it for upload. John's script assumes there is at least one correct boot session number in the file names, and then renames from that starting point.

However, there is not a single valid boot session number in any of the files that were downloaded from the receiver. My next thought was to look at the last upload, grab the last boot session number (which is 2207), rename the first file in this series with that boot number, and go from there.

But I have a few questions:

j-sayers commented 4 years ago

So it turns out that there were quite a few files that somehow got missed in the previous upload. That's what accounted for the gaps in the boot session numbers.

Thankfully the original data was all still accessible elsewhere so I was able to find the actual most recent boot number (and use it to run John's script) and also upload a host of files that somehow got missed earlier.

The upshot is that all the data since late 2017 is now finally uploaded, resulting in quite a few new detections.

StuMackenzie commented 4 years ago

If there were detections from one project in particular, it may be worth a quick note to the collaborators saying that we've recently uncovered some 'lost' data and you have more detections from these sites. FYI