Open wx5s opened 10 years ago
from prior written spec, note official Cabrillo name for contest: NCCC-CQP
Template

                              -----info sent------- -----info rcvd-------
QSO: freq  mo date       time call       nr   qth  call       nr   qth
QSO: ***** ** yyyy-mm-dd nnnn ********** nnnn aaaa ********** nnnn aaaa
QSO: 21042 CW 1997-11-01 2102 N6RNO         3 TEHA K9ZO          2 IL
000000000111111111122222222223333333333444444444455555555556666666666777
123456789012345678901234567890123456789012345678901234567890123456789012

mo: mode, CW or PH
Example Log
START-OF-LOG: 3.0
LOCATION: TEHAMA
CALLSIGN: N6RNO
CLUB: Northern California Contest Club
CONTEST: NCCC-CQP
CATEGORY-OPERATOR: MULTI-OP
CATEGORY-BAND: ALL
CATEGORY-POWER: HIGH
CATEGORY-MODE: MIXED
CATEGORY-STATION: EXPEDITION
CLAIMED-SCORE: 60
OPERATORS: N3ZZ K9YC K6MI N6RNO NO6X WB6HYD K6VLF
NAME: Jim Brown
ADDRESS: 599 DX RD
ADDRESS: Karma, CA 95990
ADDRESS: USA
CREATED-BY: N1MM Logger V9.9.7
QSO: 14026 CW 2009-10-03 1605 N6RNO 0001 TEHA N3UM 0001 MD
QSO: 14032 CW 2009-10-03 1608 N6RNO 0002 TEHA NE8J 0001 FL
QSO: 14032 CW 2009-10-03 1609 N6RNO 0003 TEHA K1IB 0001 VT
QSO: 21327 PH 2009-10-03 1609 N6RNO 0004 TEHA W8MJ 0016 MI
QSO: 14032 CW 2009-10-03 1609 N6RNO 0005 TEHA W5PQ 0001 LA
END-OF-LOG:
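A minimal sketch of how the QSO: lines in the example log could be read by splitting on whitespace rather than by column position. The `Qso` type and `parse_qso_line` function are hypothetical names for illustration, not part of any existing CQP code, and the ten-field layout is assumed from the template above.

```python
from typing import NamedTuple


class Qso(NamedTuple):
    freq: str
    mode: str
    date: str
    time: str
    sent_call: str
    sent_nr: int
    sent_qth: str
    rcvd_call: str
    rcvd_nr: int
    rcvd_qth: str


def parse_qso_line(line: str) -> Qso:
    """Split a CQP QSO: line on whitespace; column positions are ignored."""
    tokens = line.split()
    if not tokens or tokens[0] != "QSO:":
        raise ValueError("not a QSO: line")
    fields = tokens[1:]
    if len(fields) != 10:
        # Extra or missing tokens cannot be assigned to fields unambiguously.
        raise ValueError(f"expected 10 fields after QSO:, got {len(fields)}")
    freq, mode, date, time, s_call, s_nr, s_qth, r_call, r_nr, r_qth = fields
    return Qso(freq, mode, date, time, s_call, int(s_nr), s_qth,
               r_call, int(r_nr), r_qth)


qso = parse_qso_line("QSO: 21327 PH 2009-10-03 1609 N6RNO 0004 TEHA W8MJ 0016 MI")
print(qso.rcvd_call, qso.rcvd_nr, qso.rcvd_qth)  # prints: W8MJ 16 MI
```

With this kind of parser, a line whose received QTH is written as two tokens (for example "SCLA SMAT") yields eleven fields and is rejected, because the split has no way to tell which token belongs to which field.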
Rick "The Rhino" N6RNO @Tehama October 4-5, 2014 Where will you be?
On Wed, Apr 16, 2014 at 2:17 AM, K6DGW notifications@github.com wrote:
On 4/15/2014 1:41 PM, wx5s wrote:
I think that it is possible for CQP entries to be described accurately within the existing Cabrillo specs and I would be happy to work on that issue.
Be aware that it will take years, perhaps many years for logging authors to implement this (if ever). However I think that it would be wise for us (CQP) to publish such a spec and to use it for our log acceptance and our generated Cabrillo files.
K6DGW: Agree that it is wise to define the CQP specification and get it published with all the others ... I don't know how that publishing occurs, however. I really don't think it wise to put that on the critical path for this project, because it isn't going to happen in time and logger authors are surely not going to begin building compliant logs.
K6DGW: Depending on a defined Cabrillo 3 format is also against the apparent goal that whatever comes out of this effort will be usable by other contest sponsors.
There are 2 main issues: (1) At its heart, Cabrillo is a white space separated format. There are not really any "column" definitions for QSO lines. The QSO templates are pretty much fiction and the normal QSO: parsers do not use these templates - they just separate the tokens based upon white-space.
K6DGW: One goal of Cabrillo was that it be readable and editable for humans. ADIF, while a great idea and used by many systems is neither.
This is what drove the CQP change about county lines years ago. A received QTH of "SCLA SMAT" won't work because it is viewed as 2 tokens, not one! Trying to "herd these ducks into a row" just wouldn't work. Getting log authors to put "SCLA/SMAT" into the log when the user enters "SCLA SMAT" was not judged feasible, and it is hard for log authors to keep track of mults and understand this multi-mult QTH. This decision could of course be revisited. Heck, anything is theoretically possible. However, I don't think that would be a good idea, and it is "above my pay grade".
(2) CQP has special subcategories (like YL, etc). There are ways to do this within the standard Cabrillo tags. I think that we should do that. There is for example: CATEGORY-OVERLAY: The tag, "CATEGORY-OVERLAY:" is owned by N6TV, but we could put "YL" into the options allowed for that. Getting some new tag is a "very heavy lift". Adding some new contest specific option of "YL" is much easier and is within the philosophy of Cabrillo V3. That is the "right place" for this option to go.
K6DGW: The way to handle this and keep it all within the CQP Team's control is to gather that information from the operator when he/she is submitting the log -- specific fields/check boxes on the web form that have to be answered or filled with valid data before acceptance of the log. When the log is submitted, it gets saved, AND a valid CQP Cabrillo 3 log gets created and that's the one we work with.
K6DGW: A lot of problems in QSO lines can also be dealt with during the engagement. Reject isn't the only option. If an rQTH does not match any of the aliases for the CA counties or state/province abbreviations, the submittal site can ask the op about it ... "In QSO: with , ZMAD is not a recognized county abbreviation. Did you mean AMAD?" or "In QSO with the QTH is not recognized. Please enter it here." The submittal page(s) can't catch everything, and certainly they can't do any inter-log checking, but they can catch a lot directly from the operator at submittal time.
K6DGW: As Matt points out, once that engagement with the operator is over, many will not fix and re-submit in response to an after-the-fact email. Those in contention for an award would, but then those in contention for an award probably are not going to be the problem logs in the first place. If one goal is to maximize the number of logs submitted, being friendly on the site but getting the operator to fix it will achieve that. Sending after-the-fact emails won't, and just rejecting less than perfect logs will reduce participation.
> Currently this CQP sub-category of YL is done via what is called "out of band signalling", i.e. in the SOAPBOX: tag's text. This is not the right way to do this.
> Neither is some extra tag like "X-CQP: YL".
K6DGW: No, neither are.
> If we make a generalized log submission gizmo, we should do it within Cabrillo guidelines. In this case, "hey, this is such a thing as YL and that basically is a CATEGORY-OVERLAY:". When we parse the incoming log, we should report your YL status (Y/N) and give you a chance/directions on how to update that.
K6DGW: Well, yes. But better to just ask the op to check "YL" and/or "Under 18", or whatever. Get all the info we need at submittal time. Then thank him/her for participation.
73, Fred K6DGW - Northern California Contest Club - CU in the 2014 Cal QSO Party 4-5 Oct 2014 - www.cqp.org
Reply to this email directly or view it on GitHub: https://github.com/tepperly/QSO-Party/issues/3#issuecomment-40556411
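The submittal-time check described above could be sketched roughly as follows. This is only an illustration: `KNOWN_QTH` is a tiny stand-in containing abbreviations that appear in this thread (a real table would hold every county alias and state/province abbreviation), `check_rqth` is a hypothetical name, and the 0.6 similarity cutoff is an arbitrary assumption.

```python
import difflib
from typing import Optional

# Illustrative subset only, taken from abbreviations mentioned in this thread.
KNOWN_QTH = {"AMAD", "TEHA", "SCLA", "SMAT", "MD", "FL", "VT", "MI", "LA", "IL"}


def check_rqth(rqth: str) -> Optional[str]:
    """Return None if the received QTH is known, else a question for the op."""
    token = rqth.upper()
    if token in KNOWN_QTH:
        return None
    # Offer the closest known abbreviation, if any is similar enough.
    guesses = difflib.get_close_matches(token, sorted(KNOWN_QTH), n=1, cutoff=0.6)
    if guesses:
        return (f"{token} is not a recognized county abbreviation. "
                f"Did you mean {guesses[0]}?")
    return f"The QTH {token} is not recognized. Please enter it here."


print(check_rqth("ZMAD"))
```

The point of returning a question instead of raising an error is that the web form can put it in front of the operator while they are still engaged, which is the "reject isn't the only option" idea above.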
I am new to GitHub. I will try to expand and further clarify my comments via a Word doc. Unfortunately, I am unable to do it. It should be possible for me to post either a Word doc or a .PDF file, but I just don't know how to do it right now. Any hints? I have an 8 page doc that I'd like to share, but don't know how to do it. I would like to upload this doc and put a link here. Apparently only image files can be added to this kind of thread (not even a .PDF file).
The preference is that any documentation be with open source Libre ... in any event, for this purpose, just attach it to an email and send it to me.
The way to get any file into the repository would be to add it to your work area and then just commit the file. The Windows GitHub interface makes this easy (the commit part). In other words, you treat the document just like any other file in the repository.
Maybe you should create a directory cqp in your local copy of the repo and add the file there
Rick "The Rhino" N6RNO @Tehama October 4-5, 2014 Where will you be?
Ok, Rhino, sending an e-mail with Word Doc to you. Maybe we need some kind of general "discussion" sort of branch in the repository?
This doc is markup of the Cab3 spec. I get more argumentative or less focused depending on current spine pain level and meds (and I have some "good" ones). For CQP I think that whatever log acceptance gizmo we have translates whatever we get into this Cabrillo 3 compliant list of fields/tokens. ARRL/CQ for example normalize all Cabrillo 2 logs into Cabrillo 3 logs (I have the code). The rest of processing uses Cab 3. I think that we will need some CQP specific tags to explain steps in our internal processing, but that is outside what the "public spec" should be.
"note official Cabrillo name for contest: NCCC-CQP"
Yes, that is true. I had some discussions with N5KO about this in order to make that happen and this is documented in the Cabrillo spec. The general thing is "sponsor-contestName", I guess for CWO, that should be NCCC-CWO, etc for NCCC's version of the Sprint.
It seems like your document is not so much a specification of what Cabrillo is as a requirements specification of what our software needs to do with a log file as we get it. Is this a reasonable interpretation?
Rick "The Rhino" N6RNO @Tehama October 4-5, 2014 Where will you be?
Rick, Yep, I think this is close to my intention and advice.
1) Our log acceptance SW should produce a text file that completely describes the submitter's entry.
2) That text format should be Cabrillo v3(+?) (not ADIF or other formats). Other formats (ADIF) are possible, but Cabrillo appears to be the "right way".
3) This generated text format should work for all contests (a general thing).
4) Incoming CAB V2 logs should be translated into Cabrillo 3 logs.
5) The current Cabrillo V3 spec or CQP Cabrillo official spec cannot accurately describe a CQP entry. I think that it could do so with very minor additions.
6) I would like certain field names like "YL" to be added to the Cabrillo spec for CATEGORY-OVERLAY:, but this appears to be hard to do. I don't understand that.
What does ROOKIE mean? Why can't we have YL?
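As a rough illustration of point 4, here is a sketch of one piece of a Cab 2 to Cab 3 header translation. It assumes the v2 header carries a single combined `CATEGORY:` line whose tokens are operator, band, power, and an optional mode, in that order, and that these map onto the separate v3 `CATEGORY-*` tags. `normalize_header` is a hypothetical name; this is not the ARRL/CQ code mentioned earlier, and a real converter would have many more cases.

```python
# v3 tags that the combined v2 CATEGORY: tokens are assumed to map onto, in order.
V3_CATEGORY_TAGS = ["CATEGORY-OPERATOR", "CATEGORY-BAND",
                    "CATEGORY-POWER", "CATEGORY-MODE"]


def normalize_header(lines):
    """Return a copy of the log lines with v2 header lines rewritten for v3."""
    out = []
    for line in lines:
        tag, sep, value = line.partition(":")
        tag = tag.strip().upper()
        if tag == "START-OF-LOG":
            out.append("START-OF-LOG: 3.0")
        elif tag == "CATEGORY" and sep:
            # Split the combined v2 line into one v3 tag per token.
            for v3_tag, token in zip(V3_CATEGORY_TAGS, value.split()):
                out.append(f"{v3_tag}: {token}")
        else:
            out.append(line)  # everything else passes through unchanged
    return out


v2 = ["START-OF-LOG: 2.0", "CATEGORY: MULTI-OP ALL HIGH MIXED", "CALLSIGN: N6RNO"]
for line in normalize_header(v2):
    print(line)
```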
Cabrillo 3 for CQP:
To describe CQP, we need some extra CATEGORY-OVERLAY: field names, for example YL. I don't know why this sort of stuff is so controversial, but apparently it is.
I don't think that I am proposing anything that is entirely "CQP only". I think lots of contests would like to have a YL CATEGORY-OVERLAY. There are a couple of other fields. My Draft doc with comments to Cabrillo 3 explains all this and more. I've spent a lot of thought on this over the years.
If we cannot get what CQP needs documented in the official spec, we publish our own Cabrillo spec: a further edited and simplified version of my document, published on our CQP site. It should clearly highlight the stuff that isn't part of the official spec. Those additions are extra fields for the existing TAGS:, and NOT any changes to the QSO: format or any new TAGS:.
We could perhaps also add comments on the spec to our website, like "CATEGORY-TIME: is ignored" (Cabrillo cannot currently spec MAX time; the absence of this TIME TAG: indicates that). From a spec philosophy standpoint, I don't like the absence of something implicitly meaning something. But that is where we are. If this is a 48-hour contest, like ARRL DX, contest authors leave this TAG off. That is a losing battle at this point and we should go with the flow and not worry about it.
When talking with Trey N5KO, the idea seems to be that N6TV owns the TAGS:. We should add and spec new fields to the existing tags and document them on our website. This of course is harder for us as we have to contact the logging authors ourselves and get them to add say "YL" to their software. Basically make the authors aware that there is some "extra stuff" for CQP. CQP is big enough (about 1/2 the size of ARRL SS CW), that we should have some influence on them. I do think that some (if not all) of what we want will be applicable to other QSO parties. It of course would be much better for us if N6TV would agree to add our extra Field names to the spec!
If you read my doc, you will see that for example we currently ignore "LOCATION:". People typically put in their ARRL section, but I doubt that ANY QSO party would want that. Again, I don't understand why the official spec can't say that if you are in a state QSO party, put in your in-state COUNTY if you are in-state, otherwise ARRL section if out of state. Simple obvious thing that would be welcomed by other QSO parties, but requires a minor change to the official spec.
Cabrillo is a victim of its own success! This thing has turned out to be wildly successful and is "the standard" for all contests now. Changes have to be carefully done and backward compatible with what has been done before. I understand and agree with that. I personally don't agree with the proposition that Cabrillo is now an "end of life" document that cannot be amended.
Steps:
1) Generate our CQP Cabrillo spec (a simplified and edited version of my doc that focuses on the very small number of things that we need beyond the official spec and of course deletes all of the Matt ramblings). I can do that.
2) See if we can get that added to the official spec. It would help our case if somebody could do the research and leg work to say "hey, QSO Parties a, b, c, d would also like, for example, YL". Our chances for a successful addition to the spec are vastly improved if we can show that this stuff is NOT CQP specific.
3) Whether (2) is successful or not, our log acceptance SW generates a Cabrillo 3 file conformant with (1). Other QSO parties will need the same sort of thing, if not exactly what we need then a subset of it. So this should be a general feature. We are well served if an entry can be fully described within the Cabrillo text format (and it can!).
4) QSO processing (log check, entry classification stats, reporting, etc.) all proceeds with a Cabrillo file. I strongly believe that a generally useful log submission tool must work within the framework of Cabrillo and produce output Cabrillo text files instead of some other format.
Implementation Note: The reality is that no matter what we do, there are always going to be entries that do not conform to the "spec". I'll just pick on "LOCATION:". SMAT county is in SCV section. It is to our advantage to preserve what the submitter said as much as possible rather than editing their entry. If the section is there instead of the county, then we create an X-LOCATION: SMAT entry. We use the "X-" version for our processing (if it exists) instead of the non "X-" version. But we can still see what the submitter did without having to use source control to look backwards to the original entry. If we edit the LOCATION: field, we lose the original info. Same idea for other "TAGS:". The "X-" version overrides the user-submitted tag, but the original is preserved. If we have an interactive discussion with the user via webpage, then we edit the LOCATION: field. That LOCATION: becomes the "submitted version". I'm thinking here about e-mail and other situations that occur "after the fact".
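The "X-" override rule in the note above can be sketched in a few lines. The function names `parse_header` and `effective_value` are hypothetical, and the header lines are made up for the SMAT/SCV case described in the note.

```python
def parse_header(lines):
    """Collect header tags into a dict of tag -> list of values (file order kept)."""
    tags = {}
    for line in lines:
        tag, sep, value = line.partition(":")
        if sep:
            tags.setdefault(tag.strip().upper(), []).append(value.strip())
    return tags


def effective_value(tags, name):
    """Prefer the internal X- version of a tag; fall back to what was submitted."""
    for key in ("X-" + name, name):
        values = tags.get(key)
        if values:
            return values[-1]
    return None


header = parse_header([
    "LOCATION: SCV",       # what the submitter sent (ARRL section)
    "X-LOCATION: SMAT",    # what we derived, added without editing the original
    "CONTEST: NCCC-CQP",
])
print(effective_value(header, "LOCATION"))  # prints: SMAT
print(header["LOCATION"])                   # prints: ['SCV']
```

Processing reads through `effective_value`, so the override applies everywhere, while the submitted `LOCATION:` line stays in the file untouched.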
Another thought along that same vein: I would suggest as a possibility that we add something like an "X-USER-VERIFIED: YES ...blah..blah" tag when we get a log from somewhere other than e-mail (a web submission form) and the submitter agreed to our summary of "hey, these are the categories that you are entering". That means that they clicked the "OK" button or whatever. We will for a very long time continue to get e-mail submissions, which are essentially "one-way" transmissions, although we will reply with errors that are detected. These "X-blah:" tags are internal processing tags and not part of any publicly published spec. This "hey, we have proof that you agreed to X" is powerful stuff when the inevitable problems and disagreements occur. In CQP 2013 one guy didn't submit his log. We have proof via multiple redundant servers that no submission happened. Having 100% confidence in stuff like that can be important (this guy would have won). Some of this seems like small details, but this sort of stuff can matter quite a bit.
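Lots of words and music ... I think Matt's points are:
1. Define a formal, public CQP Cabrillo specification and undertake to get it formally published. This specification would include the additional CATEGORY-OVERLAY: tags CQP needs. If it gets published, logger authors would have something to work with [if so inclined], but getting it published is not on the critical path for the rest of this project.
2. Define an internal, non-public Cabrillo specification, compliant with the public spec but with internal-only additions to the tags to record the submitter's answers to questions posed on the web site form, items our system derives from the submitted log, etc.
3. Conform to the principle that "We do not ever change/replace what the operator submits": we add new/derived/corrected information using internal Cabrillo tags, and these internal tags [may] take precedence over submitted information during log processing.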
Regarding proof of submittal/agreement [Matt's last few sentences]: I fail to see how we can "prove" non-submittal/submittal-of-wrong-info without an audit trail that includes the submitter in the loop, regardless of how many multiple redundant servers we have. When the SUBMIT button is clicked, all of the entry data [call, entry class, entry sub-classes, address, email, sent QTH[s], number of Q's, etc.] should be reported back to the submitter along with a unique control number and a request for confirmation.
Once confirmed, that control number becomes part of the internal log header, and is also posted to an internal list matching control # to log header data and the date time of confirmation [separate from and independent of the internal Cabrillo log], and the call goes to whatever list drives the Logs Received page on the web site. If someone claims we didn't get their log and they don't have the control #, then we didn't get their log. If they do have a control #, we start scrambling to find the log. If they have a valid control number and claim we got info wrong, we can point to the independent audit trail and the info we asked them to confirm.
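A minimal sketch of the confirmation step described above. Everything here is an assumption for illustration: the `X-CONTROL-NUMBER` tag name, the `confirm_submission` and `lookup` functions, and the in-memory list standing in for an audit store that would really live somewhere separate from the logs.

```python
import secrets
from datetime import datetime, timezone

audit_trail = []  # stand-in for a separate, append-only audit store


def confirm_submission(summary, log_lines):
    """Record the operator's confirmation and stamp the internal log with it."""
    control = secrets.token_hex(8)  # unique control number reported to the op
    audit_trail.append({
        "control": control,
        "confirmed_at": datetime.now(timezone.utc).isoformat(),
        "summary": dict(summary),  # exactly what the op was asked to confirm
    })
    # Assumes log_lines[0] is START-OF-LOG:, so the tag lands in the header.
    log_lines.insert(1, f"X-CONTROL-NUMBER: {control}")
    return control


def lookup(control):
    """Find the audit entry for a control number, or None if we never issued it."""
    return next((e for e in audit_trail if e["control"] == control), None)


log = ["START-OF-LOG: 3.0", "CALLSIGN: N6RNO", "END-OF-LOG:"]
number = confirm_submission({"call": "N6RNO", "category": "MULTI-OP"}, log)
print(log[1])
print(lookup(number)["summary"])
```

A re-submission would simply call `confirm_submission` again and get a new control number, so both the original and the corrected entry remain in the trail.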
I posted this by clicking on "Issues". So far, it's the only way to post something besides email that I have mastered.
Fred K6DGW
This reply doesn't preserve the point numbering. I don't know how to do that. GitHub gurus, please enlighten me!
On Sun, Apr 20, 2014 at 12:43 PM, Fred notifications@github.com wrote:
> Lots of words and music ... I think Matt's points are:
> 1. Define a formal, public CQP Cabrillo specification and undertake to get it formally published. This specification would include the additional CATEGORY-OVERLAY: tags CQP needs. If it gets published, logger authors would have something to work with [if so inclined], but getting it published is not on the critical path for the rest of this project.

Correct.

> 1. Define an internal, non-public Cabrillo specification, compliant with the public spec but with internal-only additions to the tags to record the submitter's answers to questions posed on the web site form, items our system derives from the submitted log, etc.

Not exactly what I meant, but close. Everything moves towards the above Cabrillo spec.
I distinguish between interactive input and e-mails, and recommend an additional tag (like X-ACCEPTED, X-USER-VALIDATED, etc.) to show that the submitter did do something that indicates that he agreed to the categories of submission. That is different than an email submission. I could think of other "X-tags" that could be useful to us or others.

> 1. Conform to the principle that "We do not ever change/replace what the operator submits": we add new/derived/corrected information using internal Cabrillo tags, and these internal tags [may] take precedence over submitted information during log processing.
Fred, yes, it is hard to prove a "negative". CQP currently gives a confirmation number back to you that we can trace to your submission. That works with the e-mail site, but I'm not sure that it works with the web site.
When researching this myself, I couldn't find the submission confirmation number for N6O. To the best of my searching, I didn't get an email back from the website with the confirmation number.
73, Matt WX5S
I am actually planning on using an MD5 hash to identify all uploads to the server. Yes, it can collide, but that is a very, very small chance. The hash is only part of the saved file name ... still working on that aspect ... but we could use the MD5 hash as the submit ID for tracking. The nice thing about the MD5 hash is that if the same file gets uploaded multiple times, we have the option of not storing it a second time. All of this, though, is just optimization that is not that important at the beginning. I am actually working on the actual log upload at this time. Hoping to get a prototype done in the next week. This is a very simple solution atm.
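A minimal sketch of the idea, not the actual upload code: the MD5 digest of the uploaded bytes serves as the submit ID, and a file that has already been seen is not stored twice. The `store_upload` name and the dict standing in for the server's file storage are assumptions.

```python
import hashlib


def store_upload(data: bytes, store: dict):
    """Store an upload under its MD5 digest; return (digest, stored_as_new)."""
    digest = hashlib.md5(data).hexdigest()  # 32 hex characters
    is_new = digest not in store
    if is_new:
        store[digest] = data  # identical re-uploads are skipped
    return digest, is_new


uploads = {}
log_bytes = b"START-OF-LOG: 3.0\nCALLSIGN: N6RNO\nEND-OF-LOG:\n"
first_id, first_new = store_upload(log_bytes, uploads)
second_id, second_new = store_upload(log_bytes, uploads)
print(first_id == second_id, first_new, second_new)  # prints: True True False
```

A corrected log has different bytes, so it gets a different digest and is stored alongside the original.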
I think that Fred, K6DGW got it close.
(1) all logs should conform to Cabrillo, ideally Cab 3. Conversions are possible from Cab 2 -> Cab 3.
(2) Our log acceptance SW produces Cab 3 logs for whatever contest. If you want to use our SW, then you have to define a Cabrillo spec that is compatible. Everything that you might say on a webform gets translated into this Cabrillo compliant spec. CQP can do that. Some extra tag could mean to us that "submitter agreed" and this Cabrillo file got submitted via a 2-way conversation vs a one-way e-mail.
(3) Internal log processing might add some "out of spec" tags:. I could think of a few appropriate ones. But all logs from our acceptance program wind up being Cabrillo 3 logs.
(4) All submitted logs get a confirm #. The current CQP process actually does that. Everything that a submitter says, even if it is "wrong" is already saved on multiple redundant servers.
(5) I don't think that MD5 checksum is necessary. But ok. This is commonly done and is fine.
73, Matt WX5S
On 4/24/2014 3:58 PM, wx5s wrote:
I think that Fred, K6DGW got it close.
(1) all logs should conform to Cabrillo, ideally Cab 3. Conversions are possible from Cab 2 -> Cab 3.
K6DGW: If this means we only accept Cabrillo 3 logs from CQP operators, please count me out. That's a disaster for CQP. If it means that we'll create Cabrillo 3 logs from a combination of what we get from the web form and the log the op submits, that's the right move.
(2) Our log acceptance SW produces Cab 3 logs for whatever contest. If you want to use our SW, then you have to define a Cabrillo spec that is compatible. Everything that you might say on a webform gets translated into this Cabrillo compliant spec. CQP can do that. Some extra tag could mean to us that "submitter agreed" and this Cabrillo file got submitted via a 2-way conversation vs a one-way e-mail.
(3) Internal log processing might add some "out of spec" tags:. I could think of a few appropriate ones. But all logs from our acceptance program wind up being Cabrillo 3 logs.
K6DGW: Matt sometimes confuses me. :-) It seems to me:
1. We gather all the information we need at log submittal time from the web form.
1a. If the answer to a web form question does not agree with the header in the submitted log, we check it out with the op ... right then, odds are high it's our only chance.
1b. Since Cabrillo logs are created by a multitude of computer programs, each doing so in its own key, the "melody" isn't always clear. Therefore, I would place more value on the answers to the items on the web form at submittal time than on the contents of the Cabrillo header.
2. While defining a CQP Cabrillo 3 specification is important, and I think should be pursued, adoption/publication isn't going to happen for 2014 -- or maybe 2015, this is all operating on ART [Amateur Radio Time].
2a. So we define it, we ask for publication from the Powers of Cabrillo, and we do as Matt would like [I think] ... "All of the log acceptance processes yield a Cabrillo 3 log compliant with our CQP specification."
2b. What we do with it after that is our business. GREEN currently adds coded stuff to the QSO lines. At least there's a precedent. :-)
(4) All submitted logs get a confirm #. The current CQP process actually does that. Everything that a submitter says, even if it is "wrong" is already saved on multiple redundant servers.
K6DGW: Redundant servers work so long as the processes [read that ... computer code] that save the data are not the same processes for each server. I can bore you with examples, some that have probably compromised national security, but I won't, you probably all have your own. It's a never-ending saga.
K6DGW: Re-submitted logs are a non-trivial problem. We can issue a confirmation number for the first submittal, obviously it needs to be unique. What if the op re-submits a log after he has corrected it? I hope he gets a new confirmation number and we can track the re-submission as well as the original in the audit trail.
(5) I don't think that MD5 checksum is necessary. But ok. This is commonly done and is fine.
K6DGW: Ummm ... so long as whatever implements it can deal with the collisions, I guess. This isn't a really comfortable idea however. MD5 is cryptographically 'strong' but all that means is, if your message is "ATTACK AT DAWN," it is very hard to conjure up a message with the same MD5 hash that someone will interpret to be "RETREAT IMMEDIATELY". I don't know the size of the MD5 codespace, I'm very sure the number and length of the logs we receive would not challenge it, but things happen when you're in "Probability World." Keep in mind too, logs are basically ASCII-64, a much smaller subset of ASCII-256 -- lots higher probability of duplications. I really don't see why this adds anything but more complexity to the project, but it's probably going to happen.
73,
Fred K6DGW
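For a sense of scale on the "Probability World" question: an MD5 digest is 128 bits, so the codespace is 2^128 values. The sketch below uses the standard birthday approximation for accidental collisions, assuming the hash spreads inputs uniformly; under that assumption the size of the input alphabet does not enter the estimate. The function name and the one-million-log figure are made up for illustration. It says nothing about deliberately constructed MD5 collisions, which are known to be practical and are a separate concern.

```python
import math

MD5_BITS = 128  # an MD5 digest is 128 bits long


def collision_probability(n_logs: int, bits: int = MD5_BITS) -> float:
    """Birthday approximation: 1 - exp(-n(n-1) / (2 * 2**bits))."""
    space = 2 ** bits
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-n_logs * (n_logs - 1) / (2 * space))


# Even a million distinct logs gives a chance on the order of 1e-27.
print(collision_probability(1_000_000))
```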
see below
Rick "The Rhino" N6RNO @Tehama October 4-5, 2014 Where will you be?
On Thu, Apr 24, 2014 at 6:39 PM, Fred notifications@github.com wrote:
On 4/24/2014 3:58 PM, wx5s wrote:
I think that Fred, K6DGW got it close.
(1) all logs should conform to Cabrillo, ideally Cab 3. Conversions are possible from Cab 2 -> Cab 3.
K6DGW: If this means we only accept Cabrillo 3 logs from CQP operators, please count me out. That's a disaster for CQP. If it means that we'll create Cabrillo 3 logs from a combination of what we get from the web form and the log the op submits, that's the right move.
Server accepts only Cabrillo logs 2 or 3 or some compromise... there are a lot of compromises. It only outputs Cabrillo 3.0 for further processing.
(2) Our log acceptance SW produces Cab 3 logs for whatever contest. If you want to use our SW, then you have to define a Cabrillo spec that is compatible. Everything that you might say on a webform gets translated into this Cabrillo compliant spec. CQP can do that. Some extra tag could mean to us that "submitter agreed" and this Cabrillo file got submitted via a 2-way conversation vs a one-way e-mail.
(3) Internal log processing might add some "out of spec" tags:. I could think of a few appropriate ones. But all logs from our acceptance program wind up being Cabrillo 3 logs.
K6DGW: Matt sometimes confuses me. :-) It seems to me:
- We gather all the information we need at log submittal time from the web form.
We get the email and a log file from the user, if it is Cabrillo then we extract a first pass at the critical data onto a form that the user has to OK or correct. If it is not Cabrillo, then we have user resubmit a Cabrillo log (keeping the previous submission. Optional: We can ask user for all the header information even on a bad log.
1a. If the answer to a web form question does not agree with the header in the submitted log, we check it out with the op ... right then, odds are high it's our only chance.
1b. Since Cabrillo logs are created by a multitude of computer programs, each doing so in it's own key, the "melody" isn't always clear. Therefore, I would place more value on the answers to the items on the web form at submittal time than on the contents of the Cabrillo header.
- While defining a CQP Cabrillo 3 specification is important, and I think should be pursued, adoption/publication isn't going to happen for 2014 -- or maybe 2015, this is all operating on ART [Amateur Radio Time].
2a. So we define it, we ask for publication from the Powers of Cabrillo, and we do as Matt would like [I think] ... "All of the log acceptance processes yields a Cabrillo 3 CQP specification-compliant log to our specification.
2b. What we do with it after that is our business. GREEN currently adds coded stuff to the QSO lines. At least there's a precedent. :-)
(4) All submitted logs get a confirm #. The current CQP process actually does that. Everything that a submitter says, even if it is "wrong" is already saved on multiple redundant servers.
K6DGW: Redundant servers work so long as the processes [read that ... computer code] that save the data are not the same processes for each server. I can bore you with examples, some that have probably compromised national security, but I won't, you probably all have your own. It's a never-ending saga.
K6DGW: Re-submitted logs is a non-trivial problem. We can issue a confirmation number for the first submittal, obviously it needs to be unique. What if the op re-submits a log after he has corrected it? I hope he gets a new confirmation number and we can track the re-submission as well as the original in the audit trail.
All unique logs uploaded to the server are saved. Each has its own MD5 hash.
(5) I don't think that MD5 checksum is necessary. But ok. This is commonly done and is fine.
K6DGW: Ummm ... so long as whatever implements it can deal with the collisions, I guess. This isn't a really comfortable idea however. MD5 is cryptographically 'strong' but all that means is, if your message is "ATTACK AT DAWN," it is very hard to conjure up a message with the same MD5 hash that someone will interpret to be "RETREAT IMMEDIATELY". I don't know the size of the MD5 codespace, I'm very sure the number and length of the logs we receive would not challenge it, but things happen when you're in "Probability World." Keep in mind too, logs are basically ASCII-64, a much smaller subset of ASCII-256 -- lots higher probability of duplications. I really don't see why this adds anything but more complexity to the project, but it's probably going to happen.
MD5 is almost a free function in the web server (free as in we have no real code to write just call a function). And the utilities for file loading handle this as a matter of course.
An MD5 hash is 128 bits, which is about 3.4x10^38 unique values. There are known algorithmic ways to create collisions, but they are only really useful against security keys. For file signatures, when only ASCII is used, it is really hard to create a collision. Cabrillo and ADIF logs are not likely to collide. Binary files are easier to hack into a collision state.
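For a rough sense of scale on the accidental-collision worry, here is a minimal Python sketch using the standard birthday-bound approximation. The figure of 10,000 logs is a made-up illustration, not a CQP statistic, and the function names are hypothetical. It only covers accidental collisions between honest files, not deliberately crafted ones.

```python
import hashlib

MD5_BITS = 128  # an MD5 digest is 128 bits long


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of the raw log bytes as 32 hex characters."""
    return hashlib.md5(data).hexdigest()


def accidental_collision_probability(n_files: int, bits: int = MD5_BITS) -> float:
    """Birthday-bound approximation: p ~= (number of file pairs) / 2**bits."""
    pairs = n_files * (n_files - 1) / 2
    return pairs / 2.0 ** bits


if __name__ == "__main__":
    # Hypothetical log content, just to show the call.
    print(md5_hex(b"START-OF-LOG: 3.0\n"))
    # 10,000 logs is an assumed, deliberately generous number.
    print(accidental_collision_probability(10_000))
```

Under these assumptions the probability comes out many orders of magnitude below anything worth handling in code, which matches the "not likely to collide" remark above.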
73,
Fred K6DGW
- Northern California Contest Club
- CU in the 2014 Cal QSO Party 4-5 Oct 2014
- www.cqp.org
—
Reply to this email directly or view it on GitHub: https://github.com/tepperly/QSO-Party/issues/3#issuecomment-41351649
Comments interleaved below.
On Thu, Apr 24, 2014 at 6:39 PM, Fred notifications@github.com wrote:
On 4/24/2014 3:58 PM, wx5s wrote:
I think that Fred, K6DGW got it close.
(1) all logs should conform to Cabrillo, ideally Cab 3. Conversions are possible from Cab 2 -> Cab 3.
K6DGW: If this means we only accept Cabrillo 3 logs from CQP operators, please count me out. That's a disaster for CQP. If it means that we'll create Cabrillo 3 logs from a combination of what we get from the web form and the log the op submits, that's the right move.
CQP operators can submit Cabrillo 2, and we'll probably accept logs that don't even match the Cabrillo 2 spec. The goal is that what comes out of the log normalization process is Cabrillo 3 compliant.
(2) Our log acceptance SW produces Cab 3 logs for whatever contest. If you want to use our SW, then you have to define a Cabrillo spec that is compatible. Everything that you might say on a webform gets translated into this Cabrillo compliant spec. CQP can do that. Some extra tag could mean to us that "submitter agreed" and this Cabrillo file got submitted via a 2-way conversation vs a one-way e-mail.
(3) Internal log processing might add some "out of spec" tags. I could think of a few appropriate ones. But all logs from our acceptance program wind up being Cabrillo 3 logs.
K6DGW: Matt sometimes confuses me. :-) It seems to me:
- We gather all the information we need at log submittal time from the web form.
1a. If the answer to a web form question does not agree with the header in the submitted log, we check it out with the op ... right then, odds are high it's our only chance.
Agreed.
1b. Since Cabrillo logs are created by a multitude of computer programs, each doing so in its own key, the "melody" isn't always clear. Therefore, I would place more value on the answers to the items on the web form at submittal time than on the contents of the Cabrillo header.
- While defining a CQP Cabrillo 3 specification is important, and I think should be pursued, adoption/publication isn't going to happen for 2014 -- or maybe 2015, this is all operating on ART [Amateur Radio Time].
I think it's important to publish what we want, so that contest software writers at least know what the target is. Right now, we never actually say the format we want.
2a. So we define it, we ask for publication from the Powers of Cabrillo, and we do as Matt would like [I think] ... "All of the log acceptance processes yield a Cabrillo 3 log compliant with our CQP specification."
2b. What we do with it after that is our business. GREEN currently adds coded stuff to the QSO lines. At least there's a precedent. :-)
(4) All submitted logs get a confirm #. The current CQP process actually does that. Everything that a submitter says, even if it is "wrong" is already saved on multiple redundant servers.
K6DGW: Redundant servers work so long as the processes [read that ... computer code] that save the data are not the same processes for each server. I can bore you with examples, some that have probably compromised national security, but I won't, you probably all have your own. It's a never-ending saga.
K6DGW: Re-submitted logs is a non-trivial problem. We can issue a confirmation number for the first submittal, obviously it needs to be unique. What if the op re-submits a log after he has corrected it? I hope he gets a new confirmation number and we can track the re-submission as well as the original in the audit trail.
Each submission gets a unique confirmation number. Even submitting the exact same log twice should IMHO give a new confirmation number.
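A minimal Python sketch of that policy, assuming an in-memory archive: every upload gets a fresh confirmation number even when the bytes are identical, and the content hash is kept alongside so identical uploads can still be tied together in the audit trail. The class and field names are hypothetical, and `uuid4` stands in for whatever numbering scheme the server actually uses.

```python
import hashlib
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Submission:
    confirmation: str      # unique per upload, never reused
    md5: str               # identifies the content, not the upload
    received_at: datetime
    data: bytes


@dataclass
class SubmissionArchive:
    entries: list = field(default_factory=list)

    def accept(self, data: bytes) -> Submission:
        """Store every upload as-is and hand back a new confirmation number."""
        sub = Submission(
            confirmation=uuid.uuid4().hex,
            md5=hashlib.md5(data).hexdigest(),
            received_at=datetime.now(timezone.utc),
            data=data,
        )
        self.entries.append(sub)
        return sub

    def same_content(self, md5: str) -> list:
        """All uploads whose bytes hashed to the given digest."""
        return [s for s in self.entries if s.md5 == md5]
```

Keeping the confirmation number and the hash as separate fields means a re-submission never overwrites anything: the original and the corrected log both stay traceable.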
(5) I don't think that MD5 checksum is necessary. But ok. This is commonly done and is fine.
K6DGW: Ummm ... so long as whatever implements it can deal with the collisions, I guess. This isn't a really comfortable idea however. MD5 is cryptographically 'strong' but all that means is, if your message is "ATTACK AT DAWN," it is very hard to conjure up a message with the same MD5 hash that someone will interpret to be "RETREAT IMMEDIATELY". I don't know the size of the MD5 codespace, I'm very sure the number and length of the logs we receive would not challenge it, but things happen when you're in "Probability World." Keep in mind too, logs are basically ASCII-64, a much smaller subset of ASCII-256 -- lots higher probability of duplications. I really don't see why this adds anything but more complexity to the project, but it's probably going to happen.
If MD5 isn't long enough for you, there is always SHA1. The odds of two files having the same MD5 AND being valid Cabrillo files are vanishingly small.
I think the role of the MD5 is for provenance.
Tom
More Matt comments interleaved....
On Fri, Apr 25, 2014 at 8:59 AM, Tom Epperly notifications@github.com wrote:
Comments interleaved below.
On Thu, Apr 24, 2014 at 6:39 PM, Fred notifications@github.com wrote:
On 4/24/2014 3:58 PM, wx5s wrote:
I think that Fred, K6DGW got it close.
(1) all logs should conform to Cabrillo, ideally Cab 3. Conversions are possible from Cab 2 -> Cab 3.
K6DGW: If this means we only accept Cabrillo 3 logs from CQP operators, please count me out. That's a disaster for CQP. If it means that we'll create Cabrillo 3 logs from a combination of what we get from the web form and the log the op submits, that's the right move.
CQP operators can submit Cabrillo 2, and we'll probably accept logs that don't even match the Cabrillo 2 spec. The goal is that what comes out of the log normalization process is Cabrillo 3 compliant.
Matt: Tom has it right. We take whatever we get and normalize it to Cabrillo 3. That is done so that all the software that does further work with the logs sees a standardized thing. Many folks, maybe even 50%, are still using ancient logger programs that generate Cabrillo 2. That's ok and we will co-exist with that.
This Normalization is actually one of the first steps when you submit a log to any ARRL or CQ contest. You as the submitter are not aware that is happening, but it is. This is a good way and is "battle tested".
(2) Our log acceptance SW produces Cab 3 logs for whatever contest. If you want to use our SW, then you have to define a Cabrillo spec that is compatible. Everything that you might say on a webform gets translated into this Cabrillo compliant spec. CQP can do that. Some extra tag could mean to us that "submitter agreed" and this Cabrillo file got submitted via a 2-way conversation vs a one-way e-mail.
(3) Internal log processing might add some "out of spec" tags. I could think of a few appropriate ones. But all logs from our acceptance program wind up being Cabrillo 3 logs.
K6DGW: Matt sometimes confuses me. :-) It seems to me:
Sorry Fred ;-)
- We gather all the information we need at log submittal time from the web form.
1a. If the answer to a web form question does not agree with the header in the submitted log, we check it out with the op ... right then, odds are high it's our only chance.
Agreed.
Matt: Yes, agreed.
Actually many folks will abandon the log submittal process after we "yell at them". For example, some folks who are using TR will find out that our software won't like it when they don't put "DX" for the rQTH for a DX station (TR can't do that). Some folks will just say screw it. Right now we fix all of these logs and will continue to do that. However some folks will fix it for us if we tell them what we want and what is wrong with the log. That will mean less total work for us (CQP). If we don't tell them what we need, then nobody has a chance to do it!
We will get logs from various sources (website, email, etc). I was thinking of a tag like "X-Cab3Validated: true" or something like that for our internal CQP use. The other SW guys will see the need for this. The reason is that the ABSENCE of some fields actually means something in Cab 3. For example, YL: you are not a YL if that is missing from CATEGORY-OVERLAY:. There is no YL yes/no flag. Some kind of flag that told us "yes, this is a validated header" would enable us to proceed with the knowledge that "yes, this is not a YL" ("yes, I have no bananas"). If I get a log via e-mail, the YL info might be in the SOAPBOX: instead of where we would ideally like it to go. Some kind of "flag" that told us this Cab 3 header is "right", or has been adjusted post-submittal, would seem appropriate to me. If we get a log via the website, that extra "out of spec" Cab line would be present, meaning the submitter agreed that the header info was "right".
In my opinion, for a generalized contest log acceptor, we should not be defining a whole bunch of extra flags in the log - as few as possible is the right way. Like all SW specs, Cabrillo has some things that could have been done differently. We are swimming upstream if we try to change any of the fundamentals. We should "go with the flow" and add as little as possible and in a way that is completely compliant.
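To make the "absence means something" point concrete, here is a minimal Python sketch of a three-way YL answer: yes, no, or unknown. It assumes the "X-Cab3Validated: true" tag floated above (only a suggestion in this thread, not part of any published spec) and a simple "TAG: value" header parser. The function names are hypothetical.

```python
from typing import Optional


def parse_header(text: str) -> dict:
    """Collect 'TAG: value' header lines; QSO: lines are skipped."""
    header = {}
    for line in text.splitlines():
        tag, sep, value = line.partition(":")
        if not sep or tag.strip().upper() == "QSO":
            continue
        header.setdefault(tag.strip().upper(), []).append(value.strip())
    return header


def yl_status(text: str) -> Optional[bool]:
    """True = YL, False = validated header without YL, None = cannot tell."""
    header = parse_header(text)
    overlays = " ".join(header.get("CATEGORY-OVERLAY", [])).upper().split()
    if "YL" in overlays:
        return True
    # Assumed internal tag: only a validated header lets us trust the absence.
    validated = header.get("X-CAB3VALIDATED", [""])[-1].lower() == "true"
    return False if validated else None
```

A log that arrives by e-mail with "YL" only in the SOAPBOX: text comes back as unknown rather than "not YL", which is the case where someone would have to ask the operator.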
1b. Since Cabrillo logs are created by a multitude of computer programs, each doing so in its own key, the "melody" isn't always clear. Therefore, I would place more value on the answers to the items on the web form at submittal time than on the contents of the Cabrillo header.
That is the reason for my suggested "Validated" out of spec tag.
- While defining a CQP Cabrillo 3 specification is important, and I think should be pursued, adoption/publication isn't going to happen for 2014 -- or maybe 2015, this is all operating on ART [Amateur Radio Time].
I think it's important to publish what we want, so that contest software writers at least know what the target is. Right now, we never actually say the format we want.
Matt: Yes, Tom has it right again. Getting "full acceptance" of our Cab 3 spec is NOT on the critical project path. This Cab 3 spec is BOTH a requirements doc for our internal software and also a spec for the outside. Over time, we will actually start getting logs that are 100% compliant with what we ideally want. We will always have to massage some logs into that CQP spec. That is just a fact of life. Right now we get Zero logs with exactly what we want because we've never said what that exactly is.
2a. So we define it, we ask for publication from the Powers of Cabrillo, and we do as Matt would like [I think] ... "All of the log acceptance processes yield a Cabrillo 3 log compliant with our CQP specification."
2b. What we do with it after that is our business. GREEN currently adds coded stuff to the QSO lines. At least there's a precedent. :-)
(4) All submitted logs get a confirm #. The current CQP process actually does that. Everything that a submitter says, even if it is "wrong" is already saved on multiple redundant servers.
K6DGW: Redundant servers work so long as the processes [read that ... computer code] that save the data are not the same processes for each server. I can bore you with examples, some that have probably compromised national security, but I won't, you probably all have your own. It's a never-ending saga.
K6DGW: Re-submitted logs is a non-trivial problem. We can issue a confirmation number for the first submittal, obviously it needs to be unique. What if the op re-submits a log after he has corrected it? I hope he gets a new confirmation number and we can track the re-submission as well as the original in the audit trail.
Each submission gets a unique confirmation number. Even submitting the exact same log twice should IMHO give a new confirmation number.
Matt: Again yes. We don't lose anything and every confirm number can be traced back to a particular log version that was submitted. That is actually true now.
One "hole" is that some mobile stations will submit logs for their various QTHs in a way that our current software "misses". One guy made 5 log submissions, 2 of which were for sent QTH A and 3 of which were for QTH B. This wound up looking like 5 logs were submitted for QTH B. We did figure this out during log check (hey, where are the QTH A QSOs?). We used the most recent submissions for QTH A and QTH B. This wasn't automatically detected, but we did figure it out, and the archive of all log submissions helped us do that. CQP is aware of this issue.
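The manual fix described above (use the most recent submission for each sent QTH) could be automated roughly as follows. This is a sketch only: it assumes each archived submission carries a sequence number that reflects arrival order and a sent QTH already extracted from the log. The field names and the N0CALL callsign are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogSubmission:
    received_order: int  # assumed: higher means submitted later
    callsign: str
    sent_qth: str        # county or section the station operated from


def latest_per_qth(submissions) -> dict:
    """Keep only the newest submission for each (callsign, sent QTH) pair."""
    latest = {}
    for sub in sorted(submissions, key=lambda s: s.received_order):
        # Later submissions overwrite earlier ones for the same key.
        latest[(sub.callsign, sub.sent_qth)] = sub
    return latest


if __name__ == "__main__":
    # The 2-plus-3 case from the thread, with made-up county codes.
    subs = [
        LogSubmission(1, "N0CALL", "ALAM"),
        LogSubmission(2, "N0CALL", "ALAM"),
        LogSubmission(3, "N0CALL", "SCLA"),
        LogSubmission(4, "N0CALL", "SCLA"),
        LogSubmission(5, "N0CALL", "SCLA"),
    ]
    for key, sub in latest_per_qth(subs).items():
        print(key, sub.received_order)
```

Keying on the callsign alone would reproduce the original problem, where five submissions looked like five versions of one log.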
(5) I don't think that MD5 checksum is necessary. But ok. This is commonly done and is fine.
K6DGW: Ummm ... so long as whatever implements it can deal with the collisions, I guess. This isn't a really comfortable idea however. MD5 is cryptographically 'strong' but all that means is, if your message is "ATTACK AT DAWN," it is very hard to conjure up a message with the same MD5 hash that someone will interpret to be "RETREAT IMMEDIATELY". I don't know the size of the MD5 codespace, I'm very sure the number and length of the logs we receive would not challenge it, but things happen when you're in "Probability World." Keep in mind too, logs are basically ASCII-64, a much smaller subset of ASCII-256 -- lots higher probability of duplications. I really don't see why this adds anything but more complexity to the project, but it's probably going to happen.
If MD5 isn't long enough for you, there is always SHA1. The odds of two files having the same MD5 AND being valid Cabrillo files are vanishingly small.
I think the role of the MD5 is for provenance.
Matt: There has never been a recorded/documented case where normal FTP protocol has failed to reliably transmit the log files. I think with line ending issues, this is more trouble than it's worth. I recommend against any extra checksums like MD5 or whatever.
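The line-ending concern is easy to demonstrate: the same log saved with CRLF and with LF line endings hashes differently, so a checksum taken on one side of a text-mode transfer will not match the other. A minimal Python sketch, with hypothetical function names, showing the mismatch and one possible workaround (hashing after normalizing line endings), offered only as an illustration:

```python
import hashlib


def md5_raw(data: bytes) -> str:
    """Digest of the bytes exactly as received."""
    return hashlib.md5(data).hexdigest()


def md5_normalized(data: bytes) -> str:
    """Digest after converting CRLF and bare CR line endings to LF."""
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return hashlib.md5(unified).hexdigest()


if __name__ == "__main__":
    unix_log = b"START-OF-LOG: 3.0\nEND-OF-LOG:\n"
    dos_log = unix_log.replace(b"\n", b"\r\n")
    print(md5_raw(unix_log) == md5_raw(dos_log))                # raw digests
    print(md5_normalized(unix_log) == md5_normalized(dos_log))  # normalized
```

The catch is that a normalized digest no longer matches what a standard `md5sum` tool reports for the stored file, which is part of the extra complexity being argued against.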
Tom
I would be happy to work on the CQP Cabrillo 3 spec. I've talked with N6TV and N5KO about this over the years.
I think that it is possible for CQP entries to be described accurately within the existing Cabrillo specs and I would be happy to work on that issue.
Be aware that it will take years, perhaps many years for logging authors to implement this (if ever). However I think that it would be wise for us (CQP) to publish such a spec and to use it for our log acceptance and our generated Cabrillo files.
There are 2 main issues: (1) At its heart, Cabrillo is a white space separated format. There are not really any "column" definitions for QSO lines. The QSO templates are pretty much fiction and the normal QSO: parsers do not use these templates - they just separate the tokens based upon white-space. This is what drove the CQP change about county lines years ago. Received QTH of "SCLA SMAT" won't work because that is viewed as 2 tokens, not one! Trying to "herd these ducks into a row" just wouldn't work. Trying to get log authors to put "SCLA/SMAT" into the log when the user enters "SCLA SMAT" was not judged feasible, and it is hard for log authors to keep track of mults and understand this multi-mult QTH. This decision of course could be re-visited. Heck, anything is theoretically possible. However, I don't think that would be a good idea and it is "above my pay grade".
(2) CQP has special subcategories (like YL, etc). There are ways to do this within the standard Cabrillo tags. I think that we should do that. There is for example: CATEGORY-OVERLAY: The tag, "CATEGORY-OVERLAY:" is owned by N6TV, but we could put "YL" into the options allowed for that. Getting some new tag is a "very heavy lift". Adding some new contest specific option of "YL" is much easier and is within the philosophy of Cabrillo V3. That is the "right place" for this option to go.
Currently this CQP sub-category of YL is done via what is called "out of band signalling", i.e. in the SOAPBOX: tag's text. This is not the right way to do this. Neither is some extra tag like "X-CQP: YL".
If we make a generalized log submission gizmo, we should do it within Cabrillo guidelines. In this case: "hey, there is such a thing as YL, and that basically is a CATEGORY-OVERLAY:". When we parse the incoming log, we should report your YL status (Y/N) and give you a chance/directions on how to update that.
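The whitespace point in issue (1) can be shown in a few lines of Python. The QSO lines below are invented for illustration (MYCALL and HISCALL are placeholders, and the field order is only an assumed CQP-style layout); the sketch just counts tokens the way a whitespace-splitting parser would.

```python
def qso_tokens(line: str) -> list:
    """Split a QSO: line on runs of whitespace, dropping the tag itself."""
    tag, _, rest = line.partition(":")
    if tag.strip().upper() != "QSO":
        raise ValueError("not a QSO line")
    return rest.split()


if __name__ == "__main__":
    single = "QSO: 14040 CW 2014-10-04 1605 MYCALL 1 NY HISCALL 12 SCLA"
    double = "QSO: 14040 CW 2014-10-04 1605 MYCALL 1 NY HISCALL 12 SCLA SMAT"
    joined = "QSO: 14040 CW 2014-10-04 1605 MYCALL 1 NY HISCALL 12 SCLA/SMAT"
    for line in (single, double, joined):
        # A county-line QTH written with a space adds an extra token.
        print(len(qso_tokens(line)), qso_tokens(line)[-1])
```

With a space, the county-line QTH shifts the token count and the parser sees a stray field; with a slash it stays one token, but then the logging program has to understand what "SCLA/SMAT" means for multipliers.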