Closed howew closed 4 years ago
This is obviously a trivial fix.
How badly is this needed? Does this cause some immediate issues in NAACCR*Prep?
It's low priority. I only spotted it because I based the NAACCRPrep dictionary on this documentation, so when NAACCRPrep was outputting a fixed-width record with this field it was wrong (values were shifted by 2 characters). I fixed the dictionary in Prep and it solved the issue.
Sounds good. I will put a fix soon, but won't push a release yet.
I am not seeing this.
IarcUtils.calculateHistGroup() returns a value between 1 and 17.
Is the API calling the right function?
I think so.
This algorithm outputs two histology fields: "histology" (4 characters) and "histology group" (2 characters).
I reviewed the code in Algorithms and in IarcUtils and I don't see anything suspicious.
Let me double check - I know that I was getting 4 characters back and when I updated the dictionary it solved the issue with the shifting.
Sorry, I found it.
That method default to returning the full histology if the 1-17 recoding doesn't work.
That does look super suspicious to me; sometimes it returns a legit 4-characters histology and sometimes it returns a 1 to 2 digit recoded code.
I would love to go back to the original SAS program and double-check that.
But as things are now, you are correct, that needs to be a 4 digits code.
I will fix that.
OK sounds good.
@howew I went back to the original SAS program that used to compute this variable, and it didn't default the group to the full histology.
I believe this is a bug in the Java implementation.
If the full histology can't be coded into a group, then the method should return a null group.
That's the fix I will implement, instead of changing the length of the field.
Do you think it should be fixed now or wait? I told Recinda it could wait but I can backtrack if you want.
I would rather wait. This doesn't feel like a very urgent issue to me.
The fix will be done soon regardless (like today or tomorrow) because it's trivial. But I wasn't going to push a full release just for this.
But I can if it causes any real issues in other software.
I think you can leave your temp fix in for now (using 4 characters). And the when you upgrade the library with this fix, you will have to undo that temp fix. If that works for you.
Yes that's fine. I'll put a squish issue into the CFD squish project so that I don't forget.
I found another minor issue while looking at the code vs SAS. The site group is supposed to be the three first digits of the primary site. And the method that does that site group always returns 3 characters. But the site group field is defined as a 4 characters field. I am going to change it to 3 characters to make it clear it can only hold the 3 first characters of the primary site.
OK. Noted.
This is completed. It will be available the next time a release is pushed (that will be version 2.19).
In Algorithms.java, a 2 is entered line below but the algorithm produces a field that is 4 characters wide. The 2 should therefore be a 4.
addField(AlgorithmField.of(FIELD_IARC_MP_HIST_GROUP, null, 2, "IARC Multiple Primary Histology Group", "IARC MP Hist Grp", DATA_LEVEL_TUMOR));