howew closed this issue 2 years ago
It looks like the library is now prepending the state FIPS code to the zone, and this is what's causing it to go over the limit.
It appears that the SEER data is being used to populate the CensusData for this field. The old resource file still exists in the library for some reason, but that's actually good in this case because you can clearly see that the zone value previously did not include the state FIPS code.
The old data files still exist in the project, but they are under the "test" package and not released. I kept them around for this exact purpose: tracking problems.
I have in my notes that we discussed this with the IMS group (the fact that the SEER data has the state in the code) and that we agreed to start using those new codes as-is (with the state) and that we would notify Recinda. I suppose that never happened.
It seems to me the path of least resistance is to change the library to strip those leading codes. But that means the output of the algorithm won't be the same as what is contained in the SEER data, meaning it might not be the same output as other software using that data. I think that was the argument for trying to keep it as-is.
I think we should just strip the codes, though. Changing the dictionaries is a much more intrusive process!
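For reference, a minimal sketch of what stripping the leading state code could look like. This is not the library's actual code: the class and method names (`ZoneCodeNormalizer`, `stripStatePrefix`) are hypothetical, and it assumes the new SEER values are simply the state FIPS code followed by the old zone value. An old-style value that happens to begin with the same digits as the state code would also be stripped, so the real fix would need to check that against the data.

```java
// Hypothetical helper, not part of the library's API.
final class ZoneCodeNormalizer {

    private ZoneCodeNormalizer() {
    }

    // Removes a leading state FIPS code from a zone value, if present.
    // Assumption: new-style values are "<state FIPS><old zone value>".
    static String stripStatePrefix(String zone, String stateFips) {
        if (zone == null || stateFips == null || stateFips.isEmpty())
            return zone;
        // Only strip when the value starts with the state code and something remains after it.
        if (zone.length() > stateFips.length() && zone.startsWith(stateFips))
            return zone.substring(stateFips.length());
        return zone;
    }
}
```

With made-up values, `ZoneCodeNormalizer.stripStatePrefix("13A000001", "13")` would return `"A000001"`, while a value that does not start with the state code is returned unchanged.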
I agree.
This is fixed. I am going to do a release shortly.
In Algorithms.java, the cancer reporting zone is defined as 8 characters wide, but the algorithm is outputting values that are 9 characters wide (and possibly wider).
We need to confirm the maximum width of this field and make adjustments to the library and any external user-defined dictionaries.
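One way to confirm the maximum width would be to scan the zone values from the data file and report the longest one, along with any values that exceed the declared length. The sketch below is only an illustration: `ZoneWidthCheck` and its methods are hypothetical names, the limit of 8 is taken from the description above, and loading the values from the actual resource file is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical check, not part of the library.
final class ZoneWidthCheck {

    // Declared field length, per the description above (assumed to be 8).
    static final int DECLARED_LENGTH = 8;

    private ZoneWidthCheck() {
    }

    // Returns the length of the longest non-null value, or 0 if there is none.
    static int maxLength(List<String> values) {
        int max = 0;
        for (String value : values)
            if (value != null)
                max = Math.max(max, value.length());
        return max;
    }

    // Returns the values that are longer than the given limit, in input order.
    static List<String> tooWide(List<String> values, int limit) {
        List<String> result = new ArrayList<>();
        for (String value : values)
            if (value != null && value.length() > limit)
                result.add(value);
        return result;
    }
}
```

Running `tooWide` against the old and new data files with `DECLARED_LENGTH` would show whether only the state-prefixed values go over the limit.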