davidskalinder / mpeds-coder

MPEDS Annotation Interface
MIT License

Missing article info for some events #86

Closed davidskalinder closed 4 years ago

davidskalinder commented 4 years ago

There are some rows in the .csv that lack both an internal article ID and a value in db_id. Does it make sense that there'd be such cases? I think they'd be useless in Pass 2, and I can readily avoid loading them into Access, so no big deal on my end of the process.

Originally posted by @johnklemke in https://github.com/davidskalinder/mpeds-coder/issues/83#issuecomment-633066208

davidskalinder commented 4 years ago

More detail from (@olderwoman's?) meeting notes:

there are 42 event numbers with no article attached. Also no coder. But marked text and responses. This should never happen. There is some sort of serious problem here.

davidskalinder commented 4 years ago

Okay, 99% sure this is because I'm doing the joins wrong. I think these are cases where a coder annotates stuff at the event level but not at the article level. I join the article-level stuff onto the event-level stuff and then drop the article IDs, which is fine when the article-level coding is present. When it isn't, though, those rows are left with no IDs to join the metadata onto.

Wouldn't it be just swell if pandas automatically coalesced the joined IDs together? And if it did it back in version 0.16.2? If not, well, there's gotta be a coalesce function in there somewhere...
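For reference, here is a minimal sketch of the kind of coalescing I have in mind. It is not the actual export code: the frame names, column names, and values are all made up for illustration. It assumes the two ID columns have different names, so the merge keeps both, and it uses `Series.combine_first` to fill gaps in one from the other before dropping the originals.

```python
import pandas as pd

# Hypothetical event-level annotations: every row carries an article ID.
event_level = pd.DataFrame({
    "event_id": [1, 2, 3],
    "article_id_event": [10, 11, 12],
    "text": ["march", "strike", "rally"],
})

# Hypothetical article-level annotations: article 12 was never coded here.
article_level = pd.DataFrame({
    "article_id_article": [10, 11],
    "publication": ["Paper A", "Paper B"],
})

# Left join keeps every event row; event 3 gets NaN in the article-side columns.
merged = event_level.merge(
    article_level,
    how="left",
    left_on="article_id_event",
    right_on="article_id_article",
)

# Coalesce: prefer the article-side ID, fall back to the event-side ID.
merged["article_id"] = merged["article_id_article"].combine_first(
    merged["article_id_event"]
)

# Only now is it safe to drop the two original ID columns.
result = merged.drop(["article_id_article", "article_id_event"], axis=1)
print(result)
```

Dropping either original ID column before the coalescing step is what would leave event 3 with nothing to join the metadata onto.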

davidskalinder commented 4 years ago

Should be fixed in 780fd02. Need to merge, test and deploy...

davidskalinder commented 4 years ago

Merged into testing and master and deployed on all three installations. The CSVs look good to me.

@johnklemke and @olderwoman, there should be a new file with this change in it for you to review in gdelt/Skalinder/MAI_exports. I moved the previous file (from #84) to the old subdirectory; the new file should include that change as well as the stuff from this issue.

It looks to me like now every line has a user ID and article metadata, though note that:

I think all of that is what we expect, so I'll move this issue to the testing column for now; let me know if this problem looks solved and I'll close it.
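If it helps with reviewing, a check along these lines should list any rows that still lack a coder or article ID. This is only a sketch: the inline CSV stands in for the real export, and every column name except `db_id` is an assumption rather than the file's actual header.

```python
import io
import pandas as pd

# Hypothetical stand-in for the export file; only `db_id` is a column
# name taken from this thread, the rest are placeholders.
csv_text = """event_id,coder_id,db_id,text
1,7,a10,march
2,7,a11,strike
3,,,rally
"""
export = pd.read_csv(io.StringIO(csv_text))

# Columns that every row is expected to have a value in.
required = ["coder_id", "db_id"]

# Rows where at least one required column is empty.
missing = export[export[required].isnull().any(axis=1)]
print(missing)
```

An empty `missing` frame would mean every row has values in the required columns.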

davidskalinder commented 4 years ago

I'm going to quote from what I just posted in #82, since I think the exact same situation applies here:

So @johnklemke, I think this issue in the MAI-to-pass-2 handover file is fixed, but I can't remember whether you've done an import to pass 2 since May 26, so I can't 100% confirm that. However, this issue has been in testing for a while now, so I'm going to close it for now as part of the issue-tidying connected to #107. Feel free to reopen if it turns out to cause any problems.