Closed scotthibbs closed 8 months ago
Hi Scott, interesting observation. I have not worked a FD simulation run in a while. I did update the call history file at the end of January here.
I pulled some data from the current and the prior call history files, then added percentages for the top three classes...
Class | Current version | Prior version |
---|---|---|
Total entries | 14423 | 13899 |
1D | 7524 (52.2%) | 7133 (51.3%) |
1E | 2299 (15.9%) | 2218 (16.0%) |
1B | 1546 (10.7%) | 1397 (10.1%) |
2A | 613 | 687 |
3A | 609 | 594 |
1A | 323 | 348 |
4A | 274 | 247 |
2D | 183 | 188 |
1C | 179 | 165 |
5A | 133 | 150 |
2E | 133 | 141 |
2B | 114 | 109 |
2F | 99 | 123 |
3F | 59 | 70 |
6A | 57 | 53 |
1F | 48 | 57 |
3D | 45 | 48 |
4F | 36 | 24 |
3E | 32 | 35 |
7A | 22 | 17 |
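For anyone who wants to reproduce a table like the one above, here is a minimal sketch of the tally. It assumes each data line has the `Call,Class,Sect,UserText` layout shown in the club example later in this thread, and that blank lines, `#` comment lines, and `!!` header lines carry no station data (an assumption about the file, not something confirmed here). The function names are made up for illustration.

```python
from collections import Counter

def class_distribution(lines):
    """Count entries per class for lines shaped like Call,Class,Sect,UserText."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Assumption: blank lines, '#' comments and '!!' headers are not station data.
        if not line or line.startswith("#") or line.startswith("!!"):
            continue
        fields = line.split(",")
        if len(fields) < 2:
            continue
        counts[fields[1].strip().upper()] += 1
    return counts

def format_rows(counts, top=3):
    """Render table rows, adding a percentage for the `top` largest classes."""
    total = sum(counts.values())
    rows = [f"Total entries | {total} |"]
    for rank, (cls, n) in enumerate(counts.most_common()):
        pct = f" ({100 * n / total:.1f}%)" if rank < top else ""
        rows.append(f"{cls} | {n}{pct} |")
    return rows

# A few lines taken from the club example in this thread.
sample = [
    "AC9NQ,1D,IL,Dupage ARC",
    "K9EW,1E,IL,Dupage ARC",
    "KA9P,1B,IL,Dupage ARC",
    "W9DUP,5A,IL,Dupage ARC",
    "K9FEH,1D,IL,Dupage ARC",
]
counts = class_distribution(sample)
rows = format_rows(counts)
print("\n".join(rows))
```

Running the same two functions over the current and prior files would give the two columns side by side.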
It appears we have always had a high percentage of 1D stations and didn't notice it before. I notice that the new file, FDGOTA.txt, says at the top:
> FDGOTA is the new name/callhistory for both FD and FDGOTA.
It appears that the N1MM team, and those that support the FD Call History files, have combined the GOTA and FD call history files into a single file. Perhaps this action added some extra 1D stations - well that doesn't make sense because 1D are home stations. I don't get it. Perhaps this shows how many home stations participate in FD.
According to the published ARRL FD results, 53.8% of the 2023 entries were home stations (Classes D and E). The article said:
> Home station Class D and E entries dropped from 58.5% in 2022 to 53.8% in 2023. This shift suggests that 5.2% of this year’s station participants who were compelled to stay home in 2022 returned to group participation in 2023.
Apparently there has always been a high percentage of home stations and we never noticed it before.
Do we need to consider removing some of these home stations from the simulation? This would give us a more even distribution of station classifications during a simulation run. We have not done this in the past.
Isn't it also true under current rules that 1D home stations can contribute their score to their club's score? Meaning, club members can operate from home and still contribute their score to the club. Something like that, anyway.
I also think that the N1MM call history files contain data from 2-3 years prior. This allows an operator to skip a year and still have their callsign in the call history file. This can also skew the results we are seeing.
What I see in the logs is that for almost any 1D call with a club name listed, in most cases that same club name will have a non-home station entry. Here is an extreme case...
AC9NQ,1D,IL,Dupage ARC
K9EW,1E,IL,Dupage ARC
K9FEH,1D,IL,Dupage ARC
K9LEZ,1D,IL,Dupage ARC
K9ROM,1D,IL,Dupage ARC
KA9BHD,1D,IL,Dupage ARC
KA9P,1B,IL,Dupage ARC
KC9JBU,1D,IL,Dupage ARC
KD9MQV,2D,IL,Dupage ARC
KE9MC,1D,IL,Dupage ARC
KG9R,1D,IL,Dupage ARC
N8NIX,1D,IL,Dupage ARC
N9DMS,1D,IL,Dupage ARC
N9IZU,1D,IL,Dupage ARC
N9NWA,1D,IL,Dupage ARC
N9QGV,1D,IL,Dupage ARC
N9SXF,1D,IL,Dupage ARC
W9DUP,5A,IL,Dupage ARC
W9TAM,2D,IL,Dupage ARC
W9ZV,1D,IL,Dupage ARC
WB9RCE,1D,IL,Dupage ARC
WB9TRJ,1D,IL,Dupage ARC
WB9UGX,1D,IL,Dupage ARC
WB9VRQ,1D,IL,Dupage ARC
WB9WOZ,2D,IL,Dupage ARC
You can see that the last entry was 2D, so 2 transmitters active. The call W9DUP is the club entry using 5A, with an additional 24 entries under the same club name (23 home stations in Class D or E, plus one 1B).
So perhaps our algorithm should eliminate all home stations that have a club name associated with them (meaning the UserText field must be empty for a home station to be kept). A more advanced algorithm would be to collect all the club names and eliminate all home stations using one of those club names, provided the club name has at least one non-home entry. This way we keep only the club's own entries for each club name, but still allow some home stations to exist in the simulation.
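The more advanced variant described above could be sketched roughly as follows. This is only an illustration under stated assumptions: lines use the `Call,Class,Sect,UserText` layout, "home" means the class letter is D or E, and club names are matched case-insensitively. The function names and the `X1AAA`/`X2BBB` calls and `Example RC` club are invented for the example.

```python
HOME_CLASSES = {"D", "E"}  # assumption: "home station" means Class D or E

def parse(line):
    """Split a Call,Class,Sect,UserText line, padding any missing fields."""
    fields = [f.strip() for f in line.strip().split(",", 3)]
    call, cls, sect, club = (fields + ["", "", ""])[:4]
    return call, cls, sect, club

def is_home(cls):
    return cls[-1:].upper() in HOME_CLASSES

def drop_club_home_stations(lines):
    """Drop home stations whose club also has at least one non-home entry."""
    entries = [parse(l) for l in lines if l.strip()]
    # Pass 1: club names that appear on at least one non-home entry.
    clubs_in_field = {
        club.lower() for _, cls, _, club in entries if club and not is_home(cls)
    }
    # Pass 2: keep everything except home stations tied to those clubs.
    return [
        e for e in entries
        if not (is_home(e[1]) and e[3].lower() in clubs_in_field)
    ]

sample = [
    "W9DUP,5A,IL,Dupage ARC",
    "AC9NQ,1D,IL,Dupage ARC",
    "K9EW,1E,IL,Dupage ARC",
    "KA9P,1B,IL,Dupage ARC",
    "X1AAA,1D,IL,Example RC",  # hypothetical: club with no field entry
    "X2BBB,1D,IL,",            # hypothetical: no club listed
]
kept_calls = [e[0] for e in drop_club_home_stations(sample)]
print(kept_calls)
```

A home station whose club never shows up with a non-home entry stays in, which is the case the simpler "UserText must be empty" rule would lose.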
So out of the 7524 1D stations, 1847 of them have an empty UserText field (meaning we can assume they have no listed club affiliation). Perhaps this will work: keep only the home stations (D or E) that have no club affiliation. This would be simple to code up.
Actually, you may have a point. These could be clubs combining their calls with people staying at home which would be from two years ago. We may not need to fix anything if this is the case. It's ok to close this one or alter if you feel it is necessary. I had forgotten about the 1D rule from Covid.
I was going to put in an issue for the updated W/VE sections but I may not need to. For example "GH" no longer exists but I got that in my contest. But it's pulling from previous data so it would be valid. Do we need to date the contests? Or maybe reference in the readme in the contest section which year the data is from?
If we eliminate the 'B', 'D' and 'E' class stations that have a club name, we get the distribution in the table below, which is closer to what the FD results have published for total percentages of home stations. I noticed that 'B' stations can also have a club name and an associated club. These operators went portable with 1 or 2 operators as 1B or 2B. However, we still end up with ~54% in the top 3 classes. This is probably better than ignoring it completely. It's late, I need to get to sleep.
Update: I just noticed that the total entries dropped from 14423 to 5840 by eliminating the home/battery stations that are associated with a club. That's a huge reduction and certainly provides a better simulation by using club names that are active and on the air each year.
Class | current version |
---|---|
Total entries | 5840 |
1D | 1847 (31.6%) |
1B | 687 (11.8%) |
1E | 658 (11.3%) |
2A | 613 (10.5%) |
3A | 609 (10.4%) |
1A | 323 (5.5%) |
4A | 274 (4.7%) |
1C | 179 (3.1%) |
5A | 133 (2.3%) |
2F | 99 (1.7%) |
3F | 59 (1.0%) |
6A | 57 (1.0%) |
2B | 55 (0.9%) |
2D | 52 (0.9%) |
1F | 48 (0.8%) |
4F | 36 (0.6%) |
2E | 27 (0.5%) |
7A | 22 (0.4%) |
8A | 13 (0.2%) |
5F | 9 (0.2%) |
2C | 9 (0.2%) |
9A | 7 (0.1%) |
3D | 5 (0.1%) |
6F | 4 (0.1%) |
3E | 4 (0.1%) |
9F | 3 (0.1%) |
8F | 2 (0.0%) |
7F | 2 (0.0%) |
5E | 1 (0.0%) |
4E | 1 (0.0%) |
4C | 1 (0.0%) |
3B | 1 (0.0%) |
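The simpler rule behind the table above (drop every B, D, or E entry that lists a club name, then recount) might look like the sketch below. It assumes the `Call,Class,Sect,UserText` layout and treats any non-empty UserText as a club name, which is an assumption rather than something the file guarantees. The `X1AAA`/`X2BBB` lines are made up.

```python
from collections import Counter

CLUB_FILTERED = {"B", "D", "E"}  # classes removed when a club name is present

def keep_entry(cls, club):
    """Keep an entry unless it is Class B/D/E and has a club name."""
    return not (cls[-1:].upper() in CLUB_FILTERED and club.strip())

def filtered_distribution(lines):
    """Apply keep_entry to each line and count the surviving classes."""
    counts = Counter()
    for line in lines:
        fields = [f.strip() for f in line.strip().split(",", 3)]
        if len(fields) < 2:
            continue
        cls = fields[1].upper()
        club = fields[3] if len(fields) > 3 else ""
        if keep_entry(cls, club):
            counts[cls] += 1
    return counts

sample = [
    "W9DUP,5A,IL,Dupage ARC",
    "AC9NQ,1D,IL,Dupage ARC",
    "KA9P,1B,IL,Dupage ARC",
    "X1AAA,1D,IL,Example RC",  # hypothetical line
    "X2BBB,1D,IL,",            # hypothetical line, no club
]
counts = filtered_distribution(sample)
total = sum(counts.values())
for cls, n in counts.most_common():
    print(f"{cls} | {n} ({100 * n / total:.1f}%) |")
```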
Hi Scott, you mentioned...
> I was going to put in an issue for the updated W/VE sections but I may not need to. For example "GH" no longer exists but I got that in my contest. But it's pulling from previous data so it would be valid. Do we need to date the contests? Or maybe reference in the readme in the contest section which year the data is from?
Good question. The code is not testing/checking for correct sections. MRCE sends what is in the call history file and we compare what the user enters in the Exchange box against what was sent. I think we are okay on this, even if the call history file has some old data. I suppose we should filter out (ignore) lines with old section names.
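Filtering out lines with old section names could be as small as the sketch below. The set of retired sections is deliberately left to the caller, since this thread does not list them; `GH` appears in the usage only because it is the example raised above, and the callsigns are hypothetical. The section is assumed to be the third comma-separated field.

```python
def drop_retired_sections(lines, retired):
    """Return the lines whose section (third field) is not in `retired`."""
    retired = {s.strip().upper() for s in retired}
    kept = []
    for line in lines:
        fields = line.strip().split(",")
        sect = fields[2].strip().upper() if len(fields) > 2 else ""
        if sect in retired:
            continue  # skip entries that would send an obsolete section
        kept.append(line)
    return kept

sample = [
    "X1AAA,1D,GH,",            # hypothetical entry with the section questioned above
    "W9DUP,5A,IL,Dupage ARC",
]
kept = drop_retired_sections(sample, retired={"GH"})
print(kept)
```

Because the exchange check only compares what was sent with what the user typed, this filter would change which stations appear, not how entries are scored.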
Concerning dates... The call history file header usually has a date. But you are asking something different. My guess is it is not necessary. The user wants to practice. We should keep history files current if we are aware of changes (like new sections). If you want to add an appendix to the Readme.txt file, feel free to do so. We will have to remember to update this whenever we update the call history files.
I downloaded this database for 2022 FD. I then sorted all the lines by club name and looked around for a while. I noticed...
Added Issue #286 to provide a short-term fix for this issue. We will keep this Issue open to address this with a better algorithm that will keep any true 1D/1B stations that are operating as the only station for a given club name. Issue #286 removes all home/portable stations with a club name. This will be revisited in this Issue at a later time.
This has been fixed with the latest change to #286. I am marking this as fixed - ready for validation. I realized I should have used this issue, not #286, for this recent change. Sorry about that.
Below is a copy of the comment added to #286. I wanted to copy it here since it applies to the discussion above.
I have finished my rework of this. This new solution eliminates all home/portable stations that are associated with a club and keeps 25% of all home/portable stations that are not associated with a club. Keeping 25% of these gives a good distribution without feeling like home/portable (1D/1E) stations are dominating the contest. During FD, the club stations are dominating the QSO count, so reducing these 1D/1E brings the distribution into something more reasonable for a FD simulation. Below is the imported call history distribution...
ARRL FD - Call History Distribution by Classification (20 calls/dot)

    1A: 270 *************
    2A: 568 ****************************
    3A: 572 ****************************
    4A: 259 ************
    5A: 127 ******
    6A: 54 **
    7A: 22 *
    8A: 13
    9A: 6
    10A: 2
    11A: 4
    14A: 1
    17A: 2
    1B: 274 *************
    2B: 44 **
    3B: 2
    4B: 1
    1C: 105 *****
    2C: 7
    4C: 1
    1D: 612 ******************************
    2D: 32 *
    3D: 7
    4D: 2
    6D: 1
    1E: 214 **********
    2E: 17
    3E: 6
    4E: 4
    5E: 1
    1F: 39 *
    2F: 92 ****
    3F: 57 **
    4F: 35 *
    5F: 9
    6F: 4
    7F: 2
    8F: 2
    9F: 2
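For readers following along, the rule described in the comment above (drop club-affiliated home/portable stations, keep 25% of the unaffiliated ones) and the 20-calls-per-dot listing can be sketched like this. It is not the actual MRCE code. Treating "home/portable" as classes B, D, and E follows the earlier comment and is an assumption, as is picking the 25% at random; the names and the `X1AAA`/`X2BBB` calls are invented.

```python
import random

HOME_PORTABLE = {"B", "D", "E"}  # assumption, following the earlier comment

def reduce_home_portable(entries, keep_fraction=0.25, seed=None):
    """entries: (call, cls, sect, club) tuples. Returns the reduced list."""
    rng = random.Random(seed)
    kept = []
    for call, cls, sect, club in entries:
        if cls[-1:].upper() in HOME_PORTABLE:
            if club.strip():
                continue  # club-affiliated home/portable: always dropped
            if rng.random() >= keep_fraction:
                continue  # unaffiliated: kept with probability keep_fraction
        kept.append((call, cls, sect, club))
    return kept

def histogram_line(cls, count, per_dot=20):
    """One row of the listing: one '*' per full `per_dot` calls."""
    return f"{cls}: {count} {'*' * (count // per_dot)}".rstrip()

entries = [
    ("W9DUP", "5A", "IL", "Dupage ARC"),
    ("AC9NQ", "1D", "IL", "Dupage ARC"),
    ("X1AAA", "1D", "IL", ""),  # hypothetical unaffiliated home station
    ("X2BBB", "1E", "IL", ""),  # hypothetical unaffiliated home station
]
keep_all = reduce_home_portable(entries, keep_fraction=1.0)
keep_none = reduce_home_portable(entries, keep_fraction=0.0)
print(histogram_line("1D", 612))
```

With random selection the surviving counts vary from run to run unless a seed is fixed, so an imported distribution will only approximate the 25% target.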
This is my last planned change for 1.84. I'll post a final release-candidate build for final testing before release.
Way better distribution! TU
Description
I just worked a FD contest and a large majority of the clubs that responded were 1D (which is a single-op home location). The vast majority of these clubs would actually be multi-operator (a random number) alpha (in the field), with 3A the most common category.
Steps To Reproduce
Expected behavior
Most clubs would be multiple operator at a field location, not one person working from home (1D)
Actual Behavior
Many clubs were 1D.
Reproduces how often
I would say 90% of the club calls that responded gave me 1D in the exchange.
Version information
Additional context
(This can wait to be fixed for 1.84)
Can you help?
Please let us know if you are available to help. (replace '[ ]' with '[x]' to affirm)