imsweb / algorithms

Java implementation of cancer-related algorithms (NHIA, NAPIIA, Survival Time, etc...)

Incorrect width for Cancer Reporting Zone #150

Closed: howew closed this issue 2 years ago

howew commented 2 years ago

In Algorithms.java, the cancer reporting zone is defined as 8 characters wide, but the algorithm is outputting values that are 9 characters wide (and possibly wider).

We need to confirm the maximum width of this field and make adjustments to the library and any external user-defined dictionaries.
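A minimal sketch of how the maximum width could be confirmed, assuming the zone values can be loaded into a list of strings. The class name `ZoneWidthCheck`, its methods, and the sample values in the usage note are hypothetical and not part of the library; only the declared width of 8 comes from the description above.

```java
import java.util.List;
import java.util.stream.Collectors;

public class ZoneWidthCheck {

    // Declared field width, as stated in this issue (assumption: it is still 8).
    static final int DECLARED_WIDTH = 8;

    // Returns the length of the longest value, or 0 for an empty list.
    static int maxWidth(List<String> values) {
        return values.stream().mapToInt(String::length).max().orElse(0);
    }

    // Returns the values that do not fit in the declared width.
    static List<String> tooWide(List<String> values) {
        return values.stream()
                .filter(v -> v.length() > DECLARED_WIDTH)
                .collect(Collectors.toList());
    }
}
```

For example, calling `ZoneWidthCheck.tooWide(List.of("ABCDEFGH", "12ABCDEFG"))` with these made-up values would return only the second, 9-character entry.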

howew commented 2 years ago

It looks like the library is now prepending the state FIPS code to the zone, and this is what's causing it to go over the limit.

howew commented 2 years ago

It appears that the SEER data is being used to populate the CensusData for this field. The old resource file still exists in the library for some reason, but that's actually good in this case because you can clearly see that the zone value did not previously include the state FIPS code.

depryf commented 2 years ago

The old data files still exist in the project, but they are under the "test" package and are not released. I kept them around for this exact purpose: tracking problems.

I have in my notes that we discussed this with the IMS group (the fact that the SEER data has the state in the code), that we agreed to start using those new codes as-is (with the state), and that we would notify Recinda. I suppose that never happened.

It seems to me the path of least resistance is to change the library to strip those leading codes. But that means the output of the algorithm won't be the same as what is contained in the SEER data, so it might not match the output of other software using that data. I think that was the argument for trying to keep it as-is.

I think we should just strip the codes though; changing the dictionaries is a much more intrusive process!
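A rough sketch of what stripping the leading code could look like, assuming the zone value simply starts with the state FIPS code and that the state code is known at the call site. The class `ZoneNormalizer`, the method `stripStatePrefix`, and the sample values are hypothetical; this is not the change that was actually made in the library.

```java
public class ZoneNormalizer {

    // Hypothetical helper, not part of the library's API.
    // Removes the state FIPS code from the start of the zone value when present;
    // otherwise returns the value unchanged.
    static String stripStatePrefix(String zone, String stateFips) {
        if (zone == null || stateFips == null || stateFips.isEmpty())
            return zone;
        return zone.startsWith(stateFips) ? zone.substring(stateFips.length()) : zone;
    }
}
```

With made-up inputs, `stripStatePrefix("06A000123", "06")` would yield `"A000123"`, and a value without the prefix is returned as-is. One caveat of this approach: a value that has no prefix but happens to begin with the same digits as the state code would also be shortened, so a real fix would need to rely on the actual layout of the SEER data.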

howew commented 2 years ago

I agree.

depryf commented 2 years ago

This is fixed. I am going to do a release shortly.