usc-isi-i2 / t2wml

Table to Wikidata Mapping Language
MIT License

Interesting bug with uploading annotations from a previous `.t2wmlz` project file #579

Closed g1eb closed 3 years ago

g1eb commented 3 years ago

To reproduce this issue, first upload a regular file, add annotations, and save the results by downloading the `.t2wmlz` file.

Next, upload the same original file again and select Upload Annotations from the project menu in the top-left corner. Then drag and drop (or click and upload) the `.t2wmlz` project file from the previous step.

The result is that the annotations have exactly 3 more annotated cells than before.

(PS: Uploading the saved `.t2wmlz` file directly from the start works just fine. This issue occurs only when you select the Upload Annotations option from the project menu.)

Here is the original .xlsx file that I was testing with: annotation_example.xlsx

g1eb commented 3 years ago

Here's a picture of what the result looks like with the additional empty cells added to the annotation blocks:

Screen Shot 2021-08-02 at 5 09 46 PM

devowit commented 3 years ago

@abhinav-kumar-thakur

abhinav-kumar-thakur commented 3 years ago

I am looking at this issue. @g1eb, could you also share the annotation file with me? I will use it for my testing. Until then, I will create my own annotations and try to reproduce the bug.

g1eb commented 3 years ago

@abhinav-kumar-thakur would it be possible to remove/trim the empty lines at the end of the new annotations in the picture?

annotation example.zip

Here is the zip folder containing both the original input file and the saved `.t2wmlz` project file that I used to upload/copy the annotations from. You can try this on our live demo https://t2wml.isi.edu/ by uploading the original file first and then uploading the annotations via the project menu in the top left of the screen.

Screen Shot 2021-08-03 at 12 47 04 PM

g1eb commented 3 years ago

FYI we are going to rename the Upload Annotations button to Apply Annotations in #582

abhinav-kumar-thakur commented 3 years ago

Okay @g1eb, as per the implementation of `copy_annotations`, it should ignore any row/column that is completely empty. Let me check and get back to you on why this issue is happening.

g1eb commented 3 years ago

thank you!

abhinav-kumar-thakur commented 3 years ago

The `copy_annotation` function expects both the source and target DataFrames to represent empty cells as NaN, so that it can ignore them while copying the annotations.

While debugging, I found that this expectation is met by the `source_df` argument to the `copy_annotation` function. (screenshot)

But `target_df` has its empty cells as blank strings, i.e. `""`. (screenshot)
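A minimal illustration of the mismatch, assuming both arguments are plain pandas DataFrames. The data below is made up and this is not the t2wml code itself; it only shows that a NaN-based emptiness check treats the trailing row as empty in one frame but not in the other:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the last row is "empty" in both frames,
# but it is represented differently in each.
source_df = pd.DataFrame([["country", "value"], ["Ghana", "12"], [np.nan, np.nan]])
target_df = pd.DataFrame([["country", "value"], ["Ghana", "12"], ["", ""]])

# A row counts as empty here only if every cell in it is NaN.
source_empty_rows = source_df.isna().all(axis=1)
target_empty_rows = target_df.isna().all(axis=1)

print(source_empty_rows.tolist())  # the NaN row is flagged as empty
print(target_empty_rows.tolist())  # the blank-string row is not flagged
```

With a check like this, blank-string rows would be treated as real data and could end up inside the copied annotation blocks.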

We can fix this issue by making `target_df` represent empty cells the same way `source_df` does, i.e. as NaN.
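One way this normalization might look, as a sketch only: `blank_to_nan` is a hypothetical helper name, not a function from t2wml, and the actual fix in the PR may differ. It assumes the target sheet is a pandas DataFrame whose empty cells are exactly the blank string `""`:

```python
import numpy as np
import pandas as pd


def blank_to_nan(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a copy of df with blank-string cells replaced by NaN."""
    return df.replace("", np.nan)


# Made-up target sheet whose last row is empty but stored as blank strings.
target_df = pd.DataFrame([["country", "value"], ["Ghana", "12"], ["", ""]])
normalized = blank_to_nan(target_df)

# After normalization, a NaN-based check can see the trailing empty row.
empty_rows = normalized.isna().all(axis=1)
print(empty_rows.tolist())
```

`DataFrame.replace` returns a new frame, so the original `target_df` is left untouched in this sketch.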

g1eb commented 3 years ago

Updated on our live demo to include this PR ^ https://t2wml.isi.edu/