lvarnedoe closed 2 years ago
Yes, and we can remove the duplicates. Just change the UNION ALL to UNION, which drops rows that are identical in every column.
Doing a UNION instead of a UNION ALL reduces the number of rows returned, but there are still rows with repeated addresses. This is because one row might have the county populated while the otherwise identical row does not.
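For anyone who wants to reproduce the behavior, here is a minimal sketch using Python's built-in sqlite3 module. The table names (addr_at_dx, addr_current), the columns, and the sample rows are all made up for illustration; they are not the actual CNEXT tables or the cnext_location query. It only shows that UNION collapses rows that match in every column, so a row missing the county survives next to its populated twin.

```python
import sqlite3

# Hypothetical stand-ins for the two address sources; not the real CNEXT schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE addr_at_dx  (address_1 TEXT, city TEXT, state TEXT, zip TEXT, county TEXT);
    CREATE TABLE addr_current(address_1 TEXT, city TEXT, state TEXT, zip TEXT, county TEXT);

    -- Same address in both sources, every column identical.
    INSERT INTO addr_at_dx   VALUES ('1 Main St', 'Springfield', 'GA', '30000', 'Example');
    INSERT INTO addr_current VALUES ('1 Main St', 'Springfield', 'GA', '30000', 'Example');

    -- Same address in both sources, but county is missing in one of them.
    INSERT INTO addr_at_dx   VALUES ('9 Oak Ave', 'Springfield', 'GA', '30000', 'Example');
    INSERT INTO addr_current VALUES ('9 Oak Ave', 'Springfield', 'GA', '30000', NULL);
""")


def count_rows(set_operator):
    """Count rows returned when the two sources are combined with the given operator."""
    query = f"SELECT * FROM addr_at_dx {set_operator} SELECT * FROM addr_current"
    return len(conn.execute(query).fetchall())


union_all_rows = count_rows("UNION ALL")  # keeps every row from both sources
union_rows = count_rows("UNION")          # drops only fully identical rows

# Addresses that still appear more than once after the UNION.
repeated = conn.execute("""
    SELECT address_1, COUNT(*)
    FROM (SELECT * FROM addr_at_dx UNION SELECT * FROM addr_current)
    GROUP BY address_1
    HAVING COUNT(*) > 1
""").fetchall()

print(union_all_rows, union_rows, repeated)
```

With this sample data, UNION ALL returns 4 rows and UNION returns 3: the fully identical '1 Main St' pair collapses, while '9 Oak Ave' remains twice because the county differs.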
Although it looks like duplication, it's really not. One record is the location at DX (CNEXT TUMOR) and the other is the current location (CNEXT PATEXTENDED). Most of the data is the same, but the records are not duplicates, because they pertain to different locations in NAACCR.
addressed by Union queries in cnext_location instead of Union All #61
The cnext_location script pulls a patient's address at the time of diagnosis and the patient's current address. These addresses can be the same. Per the OMOP CDM, "Each address or Location is unique and is present only once in the table." I'm not sure we should pull the address twice when the two addresses are the same.