n1kdo / n1mm_view

Real-time statistics viewer for N1MM+ on Field Day
BSD 2-Clause "Simplified" License

Using the N1MM ID field for the duplicate QSO detection #39

Closed ny4i closed 1 year ago

ny4i commented 2 years ago

N1MM now sends an ID in the contact record. This appears to be a 32 byte GUID. I suggest that duplicate detection be simplified to use the ID to look up the contact. This is important because the duplicate-detection hash is built from the following fields:

def checksum(data):
    """
    generate a unique ID for each QSO.
    this is using md5 rather than crc32 because it is hoped that md5 will have less collisions.
    """
    hval = data['timestamp'] + data['rxfreq'] + data['txfreq'] + data['operator'] + data['mode'] + data['call']
    return int(hashlib.md5(hval.encode()).hexdigest(), 16)

But the contactdelete record from N1MM does not include every field that goes into the hash.

From the N1MM UDP Documentation:

<?xml version="1.0" encoding="utf-8"?>
<contactdelete>
    <app>N1MM</app>
    <timestamp>2020-01-17 16:43:38</timestamp>
    <call>W1AW</call>
    <contestnr>73</contestnr>
    <StationName>CONTEST-PC</StationName>
    <ID>a1b2c3d4e5f6g7h</ID>
</contactdelete>

So when a contact is corrected and N1MM sends the contact delete and then contact replace, the collector cannot delete the entry from the duplicates hash table.

If it uses the ID field, then it can be deleted.

If it is not deleted, then when N1MM sends the contact replace packet (which has the same fields as contactinfo), it is flagged as a duplicate unless the timestamp of the contact or the callsign was changed.

So by just using ID for duplicates detection, it would allow the deletion/replace change process to work.
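A minimal sketch of that idea, assuming each packet has already been parsed into a dict keyed by the N1MM field names and that the ID strings have been normalized to one format. The `QsoTracker` class and its return values are hypothetical and not part of n1mm_view:

```python
class QsoTracker:
    """Track seen QSOs by N1MM ID rather than by a hash of contact fields.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        self.seen = {}  # maps ID -> most recent contact record

    def handle(self, kind, record):
        """Apply one packet and report what happened to the seen table."""
        qso_id = record['ID']
        if kind == 'contactdelete':
            # Only the ID is needed, so the sparse delete record is sufficient.
            if self.seen.pop(qso_id, None) is not None:
                return 'deleted'
            return 'ignored'
        if kind == 'contactreplace':
            existed = qso_id in self.seen
            self.seen[qso_id] = record
            return 'replaced' if existed else 'added'
        if kind == 'contactinfo':
            if qso_id in self.seen:
                return 'duplicate'
            self.seen[qso_id] = record
            return 'added'
        return 'ignored'
```

With this shape, a delete followed by a replace for the same ID re-adds the contact instead of tripping the duplicate check, and a replace that arrives without a preceding delete simply overwrites the stored record.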

Where this is really important is that if N1MM resends all the contacts to UDP, it sends out contact delete and contact replace packets.

This is the sequence sent by N1MM on a rebroadcast:

<?xml version="1.0" encoding="utf-8"?>\r<contactdelete>\r\t<app>N1MM</app>\r\t<timestamp>2022-06-27 19:45:53</timestamp>\r\t<call>W2IU</call>\r\t<contestnr>22</contestnr>\r\t<StationName>DESKTOP-4K20B9M</StationName>\r\t<ID>fcbdeb9c-c69d-4479-90b1-2944b6cd76be</ID>\r</contactdelete>

2022-06-27 19:59:36.086 INFO DELETEQSO: W2IU, timestamp = 1656359153

Then it sends the contact replace:

<?xml version="1.0" encoding="utf-8"?>\r<contactreplace>\r\t<app>N1MM</app>\r\t<contestname>FD</contestname>\r\t<contestnr>22</contestnr>\r\t<timestamp>2022-06-27 19:45:53</timestamp>\r\t<mycall>W4TA</mycall>\r\t<band>14</band>\r\t<rxfreq>1420000</rxfreq>\r\t<txfreq>1420000</txfreq>\r\t<operator>NY4I</operator>\r\t<mode>USB</mode>\r\t<call>W2IU</call>\r\t<countryprefix>K</countryprefix>\r\t<wpxprefix>W2</wpxprefix>\r\t<stationprefix>W4TA</stationprefix>\r\t<continent>NA</continent>\r\t<snt>59</snt>\r\t<sntnr>2</sntnr>\r\t<rcv>59</rcv>\r\t<rcvnr>0</rcvnr>\r\t<gridsquare> </gridsquare>\r\t<exchange1>1E</exchange1>\r\t<section>NLI</section>\r\t<comment></comment>\r\t<qth></qth>\r\t<name></name>\r\t<power></power>\r\t<misctext> </misctext>\r\t<zone>5</zone>\r\t<prec></prec>\r\t<ck>0</ck>\r\t<ismultiplier1>0</ismultiplier1>\r\t<ismultiplier2>0</ismultiplier2>\r\t<ismultiplier3>0</ismultiplier3>\r\t<points>1</points>\r\t<radionr>1</radionr>\r\t<run1run2>1</run1run2>\r\t<RoverLocation> </RoverLocation>\r\t<RadioInterfaced>0</RadioInterfaced>\r\t<NetworkedCompNr>0</NetworkedCompNr>\r\t<IsOriginal>True</IsOriginal>\r\t<NetBiosName>DESKTOP-4K20B9M</NetBiosName>\r\t<IsRunQSO>0</IsRunQSO>\r\t<StationName>DESKTOP-4K20B9M</StationName>\r\t<ID>fcbdeb9cc69d447990b12944b6cd76be</ID>\r\t<IsClaimedQso>1</IsClaimedQso>\r</contactreplace>

And then in the collector log, we get a duplicate message because the item was not deleted upon the contact delete.

2022-06-27 19:59:36.118 DEBUG duplicate message
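To make the mismatch concrete, here is a small sketch that parses the contactdelete packet shown above with the standard library and lists which of the hashed fields it lacks. The `parse_packet` helper is a hypothetical name, not the collector's actual parser:

```python
import xml.etree.ElementTree as ET


def parse_packet(data: bytes):
    """Return (packet kind, {field name: text}) for one N1MM UDP payload.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(data)
    fields = {child.tag: (child.text or '') for child in root}
    return root.tag, fields


# The contactdelete payload from the rebroadcast above, as received bytes.
packet = (
    b'<?xml version="1.0" encoding="utf-8"?>\r<contactdelete>\r\t<app>N1MM</app>'
    b'\r\t<timestamp>2022-06-27 19:45:53</timestamp>\r\t<call>W2IU</call>'
    b'\r\t<contestnr>22</contestnr>\r\t<StationName>DESKTOP-4K20B9M</StationName>'
    b'\r\t<ID>fcbdeb9c-c69d-4479-90b1-2944b6cd76be</ID>\r</contactdelete>'
)

kind, fields = parse_packet(packet)

# Fields that checksum() reads; several never arrive in a delete packet.
hashed = ('timestamp', 'rxfreq', 'txfreq', 'operator', 'mode', 'call')
missing = [name for name in hashed if name not in fields]
```

Here `missing` holds `rxfreq`, `txfreq`, `operator`, and `mode`, so the hash cannot be recomputed from the delete packet alone, while `fields['ID']` is always available.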

ny4i commented 2 years ago

I'm glad to code this up, but I wanted to see if you agree, Jeff, before I do.

ny4i commented 2 years ago

Correction: it is 16 bytes, but it is sent as hex. From the website:

ID is a 16 byte unique GUID identifier for each contact in the log. Note that it is sent as 2 hex characters per byte.
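Since the sample packets in this thread show the ID both with and without hyphens, it may be worth normalizing it before comparing. A sketch using Python's standard `uuid` module; the `normalize_id` helper is a hypothetical name:

```python
import uuid


def normalize_id(raw: str) -> str:
    """Return the N1MM ID as 32 lowercase hex characters.

    Hypothetical helper; uuid.UUID accepts both the hyphenated and the
    plain 32-character hex form and raises ValueError for anything else.
    """
    return uuid.UUID(raw.strip()).hex


hyphenated = normalize_id('fcbdeb9c-c69d-4479-90b1-2944b6cd76be')
plain = normalize_id('fcbdeb9cc69d447990b12944b6cd76be')

# Both spellings identify the same contact: 32 hex characters, 16 bytes.
same_contact = hyphenated == plain
id_length_in_bytes = len(bytes.fromhex(plain))
```

Note that the placeholder ID in the documentation example (`a1b2c3d4e5f6g7h`) is not valid hex, so a strict parser like this one would reject it with a `ValueError`.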

n1kdo commented 2 years ago

This sounds great to me. Calculating the hash is way more expensive than simple string comparison, so this sounds like a win. Please proceed.

KD4Z commented 2 years ago

This seems timely, as Tom sent out a notification this morning stating that as of the July 5 build, the "contactdelete" message will no longer be sent and it is now advisable to use "ID GUID" to identify rows.

ny4i commented 2 years ago

Yes, that was my doing. I noticed the packets came out of order, so the combination of contact delete/replace and the seen set was not very useful. The ID GUID allows it to work better. After discussing it with the team, they opted to skip the delete/replace sequence since that predated the ID GUID.

My thought was that double packets are rare, so I just changed the database code to do an INSERT OR REPLACE and made ID NOT NULL and UNIQUE.

I am just using the database as the duplicate detection mechanism, as the seen set did not seem to be adding anything. Again, this is simply because in my experience double messages are rare (unless one does a rebroadcast or has one station set to Broadcast All). The overhead is not too great in that case.

cursor.execute(
    'insert or replace into qso_log \n'
    ' (timestamp, mycall, band_id, mode_id, operator_id, station_id , rx_freq, tx_freq, \n'
    ' callsign, rst_sent, rst_recv, exchange, section, comment, msgID)\n'
    ' values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?);',
    (calendar.timegm(timestamp), mycall, band_id, mode_id, operator_id, station_id,
     rx_freq, tx_freq, callsign, rst_sent, rst_recv, exchange, section, comment, msgID))
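For anyone who wants to see the effect in isolation, here is a self-contained sketch against an in-memory SQLite database. The three-column table is a deliberately reduced, hypothetical stand-in for the real qso_log schema, and `record_contact` is an illustrative name:

```python
import sqlite3


def make_db():
    """Create an in-memory database with a reduced, hypothetical qso_log table."""
    conn = sqlite3.connect(':memory:')
    conn.execute(
        'create table qso_log ('
        ' callsign text, section text, msgID text not null unique);'
    )
    return conn


def record_contact(conn, callsign, section, msg_id):
    """Insert a contact, or overwrite the existing row that has the same ID."""
    conn.execute(
        'insert or replace into qso_log (callsign, section, msgID)'
        ' values (?, ?, ?);',
        (callsign, section, msg_id),
    )


conn = make_db()
qso = 'fcbdeb9cc69d447990b12944b6cd76be'
record_contact(conn, 'W2IU', 'NLI', qso)  # original contact
record_contact(conn, 'W2IU', 'NLI', qso)  # rebroadcast of the same contact
record_contact(conn, 'W2IU', 'ENY', qso)  # corrected section, same ID
rows = conn.execute('select callsign, section from qso_log;').fetchall()
```

After the three calls the table still holds a single row for that ID, carrying the corrected section, because the UNIQUE constraint turns each repeat into a replacement rather than a second row.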

I have some significant changes in the works to support all of this. I am finalizing my testing and will have a pull request by the weekend.

Tom NY4I


ny4i commented 2 years ago

Once N1MM sends out the release on Tuesday (or if you grab the experimental version), the code in my fork can be used to test it.

If anyone wants to test this further, I would appreciate it. It works here OK.

https://github.com/ny4i/n1mm_view

Thanks,

Tom NY4I


n1kdo commented 2 years ago

I looked at your branch. Can you name the new column 'n1mm_id' or 'qso_id' because I don't think it is really the msgID.

Otherwise I like it. Thank you! Jeff

ny4i commented 2 years ago

Sure


ny4i commented 2 years ago

Just an update. We are moving so I had to stop work on this project. I have this code but have to complete its testing. I will get to that after we move in September or October. It will be done well in time for Winter Field Day.

n1kdo commented 1 year ago

Winter Field Day is coming! I hope your move went well, Tom.

One of the changes I am going to add at some time is the ability to control which displays are shown. Feedback from one local club is that they were interested in the raw scores, but not so much the per-operator scores. Personally, I like the per-operator scores, but the plan is to make which panels are shown (and perhaps for how long) configurable.

73, Happy New Year.

ny4i commented 1 year ago

Hi. I was just thinking about this project as I found my PC after misplacing it.

That sounds like a great change. I will take a look this weekend to catch up on what I was working on before I moved.

Thanks,

Tom


n1kdo commented 1 year ago

Thank you for your code contribution. This issue seems to now be fixed.