CancerRegistryOfNorway / nordcanpreprocessing


Update entities #30

Closed CotterpinDoozer closed 2 years ago

CotterpinDoozer commented 3 years ago

We wish to include the old entity 400 (Leukaemias - C91-C95). This would call for a change to the existing entity layout.

HuidongTian commented 3 years ago

Entity 400 does not seem to have been added. See the specification: https://github.com/CancerRegistryOfNorway/NORDCAN/wiki/Specification-Entities

CotterpinDoozer commented 3 years ago

I know. It was in the old NORDCAN database, but we initially did not include it in the new version.

To add it to the current structure of NORDCAN, we either need to include a level_14 or redo the whole structure. One option is to have no hierarchy at all: just assign topography/morphology codes to each entity without thinking about hierarchical structures. I am not sure which is best and gives us the most flexibility in the future.
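A minimal sketch of the non-hierarchical option, assuming each entity is defined only by the ICD-10 three-character categories it covers. Entity 400 and its C91-C95 range come from this thread; entity 9001, the `ENTITY_CODES` name and the helper functions are hypothetical and only illustrate that overlapping entities need no level columns. Morphology codes are left out for brevity.

```python
def icd10_categories(start: int, end: int, letter: str = "C") -> list[str]:
    """Expand a range such as C91-C95 into ['C91', ..., 'C95']."""
    return [f"{letter}{n:02d}" for n in range(start, end + 1)]


# Flat definition: entity number -> ICD-10 categories, with no levels.
ENTITY_CODES = {
    400: icd10_categories(91, 95),  # Leukaemias, C91-C95 (from this thread)
    9001: ["C91"],                  # hypothetical sub-group, illustration only
}


def entities_for(icd10_code: str) -> list[int]:
    """Return every entity whose definition covers the given ICD-10 code."""
    category = icd10_code.replace(".", "").upper()[:3]
    return sorted(e for e, codes in ENTITY_CODES.items() if category in codes)


print(entities_for("C91.0"))
print(entities_for("C95.9"))
```

With this layout, adding an entity means adding one dictionary entry, and a record can belong to any number of entities without a new level.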

CotterpinDoozer commented 2 years ago

We should probably also remove some of the new lymphoma/leukaemia groups, which we will probably never use. Siri will discuss with Huidong the best way to handle this (leave them in the program or update, still keeping room for additional entities that might come later).

CotterpinDoozer commented 2 years ago

A new version of the ICD-10 --> entity table can be downloaded from https://elvis.kreftregisteret.no/tables/view/5 and should replace the old one, which is here: https://github.com/CancerRegistryOfNorway/NORDCAN/blob/master/specifications/icd10_to_entity_columns.csv

As I have made a lot of changes to this file, moved some entities between levels and deleted a lot of entities, we need to check the R code thoroughly to see whether the different levels are used in any way that could be affected by these changes.
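As a starting point for that check, here is a rough sketch that lists which entities were removed, added or moved between levels when comparing two versions of the table. It assumes the level columns are named `entity_level_1`, `entity_level_2` and so on; that naming and the toy rows below are assumptions, not the real contents of icd10_to_entity_columns.csv.

```python
import csv
import io


def entity_levels(csv_text: str) -> dict[str, set[str]]:
    """Map each entity number to the set of level columns it appears in."""
    levels: dict[str, set[str]] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column, value in row.items():
            # Skip non-level columns and empty cells.
            if not column or not column.startswith("entity_level_"):
                continue
            if value and value.strip():
                levels.setdefault(value.strip(), set()).add(column)
    return levels


def compare_tables(old_text: str, new_text: str) -> dict[str, list[str]]:
    """Report entities removed, added, or placed in different levels."""
    old, new = entity_levels(old_text), entity_levels(new_text)
    return {
        "removed": sorted(set(old) - set(new)),
        "added": sorted(set(new) - set(old)),
        "moved": sorted(e for e in set(old) & set(new) if old[e] != new[e]),
    }


# Toy data with made-up entity numbers, only to show the output shape.
OLD = "icd10,entity_level_1,entity_level_2\nC910,10,20\nC920,10,30\n"
NEW = "icd10,entity_level_1,entity_level_2\nC910,10,\nC920,20,40\n"
print(compare_tables(OLD, NEW))
```

Entities reported as removed or moved are the ones worth searching for in the R code first.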

HuidongTian commented 2 years ago

Updated: https://github.com/CancerRegistryOfNorway/NORDCAN/blob/master/specifications/icd10_to_entity_columns.csv

CotterpinDoozer commented 2 years ago

I checked the entities that came out of the last version of nordcan.R. There were a few too many (445, 449, 452, 453, 454). This was due to an error in the table at https://elvis.kreftregisteret.no/tables/view/5 which I have now corrected, so we need to update this again: https://github.com/CancerRegistryOfNorway/NORDCAN/blob/master/specifications/icd10_to_entity_columns.csv

However, there are no cases for the new entities 105 and 400, even though they are included in the table and I know there should be cases for them. Possibly something else in the code needs to be updated as well to get this to work. Can you check?

CotterpinDoozer commented 2 years ago

Actually, I can see in the cancer_record_dataset that both entity 105 and entity 400 exist on several records, but it seems they are not exported to the statistics tables, so that's probably where we need to look for the error.

CotterpinDoozer commented 2 years ago

An update of the table https://github.com/CancerRegistryOfNorway/NORDCAN/blob/master/specifications/entity_usage_info.csv was missing. This is now fixed, and 105 and 400 are included in the output files for mortality, incidence and prevalence. They are not included in survival. I will follow up on this with Bjarte.

105 and 400 are not included in the graphic comparison files. We should double-check whether this is just because they weren't included in the previous version or whether there is something else we need to update.
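To catch this kind of mismatch earlier, a small check along these lines could compare the entities listed in a reference table with those that actually appear in an output file. This is only a sketch: the function name is hypothetical, and the sets below are toy values that reuse entity numbers from this thread rather than real file contents.

```python
def audit_entities(reference: set[int], observed: set[int]) -> dict[str, list[int]]:
    """Compare the entities a table lists with those found in an output.

    'missing' are listed in the reference but absent from the output;
    'unexpected' appear in the output without being listed.
    """
    return {
        "missing": sorted(reference - observed),
        "unexpected": sorted(observed - reference),
    }


# Toy example: 105 is listed but not exported, 445 is exported but not listed.
listed_in_table = {105, 400}
found_in_output = {400, 445}
print(audit_entities(listed_in_table, found_in_output))
```

The same check could be run once per output type (incidence, mortality, prevalence, survival, comparison files) so that a gap in one of them stands out.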

CotterpinDoozer commented 2 years ago

OK for version 9.2.