Open AppyKul opened 1 month ago
Can you please check if you are saving the labels correctly? Please check this video if it helps.
Hi Sonal, yes I did. Please find the screenshots attached. There is an issue at this point in the code:

    if start > 0:
        start += len('data-title="')
        end = pair.children[0].value.find('"', start+2)
    pair_id = pair.children[0].value[start:end]   ---> issue at this line, as end is not recognized

The variable end is not recognized because execution is not entering the if clause.
After start = pair.children[0].value.find('data-title="') the value of start is -1. So clearly it is not finding anything?
I also tried printing the contents of pair.children[0]. It is as below: HTML(value='z_cluster1719220687105:01719220687105:0custid1010410103fname joshua joshuslname george georgestNo77add1 jelbart street jelbart streetadd2 city hyam sbeach hyams beachareacode vic vicstate40324032dob1945022019450220ssn43322624332262')
Please help.
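The -1 reported above is what Python's str.find returns when the substring is absent, which would explain why the if branch is skipped and end is never assigned. A minimal sketch of a guarded extraction follows. The function name extract_pair_id and the sample markup are hypothetical and are not taken from the labeler's source.

```python
from typing import Optional

MARKER = 'data-title="'


def extract_pair_id(html_value: str) -> Optional[str]:
    """Return the text of the first data-title attribute, or None if absent.

    Hypothetical helper: it mirrors the lookup quoted in the thread but
    always assigns a result, so a missing marker cannot leave `end` unset.
    """
    start = html_value.find(MARKER)
    if start == -1:  # str.find signals "not found" with -1
        return None
    start += len(MARKER)
    end = html_value.find('"', start)
    if end == -1:  # opening quote without a closing one
        return None
    return html_value[start:end]


# Made-up markup that contains the attribute.
with_marker = extract_pair_id('<td data-title="z_cluster">1719</td>')

# Plain text with no attribute at all, similar in shape to the printed value.
without_marker = extract_pair_id("z_cluster1719220687105:0custid10104")
```

Comparing against -1 rather than testing start > 0 also covers the case where the marker sits at index 0 of the string.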
Do you see the widget properly? Is the record getting parsed and are the columns getting displayed properly?
Apologies for not attaching screenshots. Please find the screenshots attached.
The widgets do not have records displayed in the way they are displayed in the reference video.
OK, so it seems that the data itself is not getting parsed or viewed correctly. Can you please check the settings for your input pipes?
It's the same as in the example code:

    schema = "id string, fname string, lname string, stNo string, add1 string, add2 string, city string, state string, areacode string, dob string, ssn string"
    inputPipe = CsvPipe("testFebrl", "/FileStore/tables/test_1.csv", schema)
    args.setData(inputPipe)

test_1.csv is a subset of the test.csv in the example, with just 100 records.
Can you do a normal PySpark read on the data to see if you can read it properly?
Yes, please find the screenshot of a simple DataFrame read from the CSV:
The data appears in tabular format.
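Besides looking at the DataFrame, one more check that may help is whether every row of the CSV has as many columns as the schema string declares. The sketch below uses only Python's standard csv module. The helper names and the two sample rows are made up for illustration and are not the contents of test_1.csv.

```python
import csv
import io
from typing import Iterable, List, Tuple


def schema_field_names(schema: str) -> List[str]:
    """Pull the column names out of a 'name type, name type' schema string."""
    return [part.split()[0] for part in schema.split(",") if part.strip()]


def rows_with_wrong_width(lines: Iterable[str], expected: int) -> List[Tuple[int, int]]:
    """Return (row_number, column_count) for rows that do not match the schema."""
    bad = []
    for row_number, row in enumerate(csv.reader(lines), start=1):
        if len(row) != expected:
            bad.append((row_number, len(row)))
    return bad


# Schema string as posted in the thread.
schema = (
    "id string, fname string, lname string, stNo string, add1 string, "
    "add2 string, city string, state string, areacode string, dob string, ssn string"
)
fields = schema_field_names(schema)

# Hypothetical sample: the first row has 11 columns, the second is truncated.
sample = io.StringIO(
    "10104,joshua,george,77,jelbart street,,hyams beach,vic,4032,19450220,4332262\n"
    "10103,joshu,george,77\n"
)
mismatches = rows_with_wrong_width(sample, len(fields))
```

For a real file, an open file handle can be passed in place of the StringIO object. An empty result only says the column counts agree with the schema; it does not confirm that the labeler widget will render the records.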
Hi @sonalgoyal , my free AWS trial ends in 9 days. Can you please help on this?
Can you try passing the above df as an InMemoryPipe and see if that displays correctly? I am afraid we may be hitting a bug if that doesn't work. Please log the browser version and DBR information to the issue if it is not resolved.
Also, can you please try with the exact test file we supply?
Hi,
After the interactive labeler phase, when the code below runs:

    print(f'You have accumulated {n_pos} pairs labeled as positive matches.')
    print(f'You have accumulated {n_neg} pairs labeled as not matches.')

the count is displayed incorrectly.
The line start = pair.children[0].value.find('data-title="') is probably where the issue is, as the value of start is -1.
Can you please tell me what exactly happens in the data-title section?
The value of pair.children[0] is:
HTML(value='z_cluster1719220687105:01719220687105:0custid1010410103fname joshua joshuslname george georgestNo77add1 jelbart street jelbart streetadd2 city hyam sbeach hyams beachareacode vic vicstate40324032dob1945022019450220ssn43322624332262')
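The line quoted above searches the widget's HTML string for the literal text data-title=", and the printed value contains no such text, which is consistent with start being -1. As a diagnostic, the sketch below lists every data-title attribute found in a string using Python's standard html.parser. The class and function names and the sample markup are hypothetical, since the real widget markup is not shown in this thread.

```python
from html.parser import HTMLParser
from typing import List


class DataTitleCollector(HTMLParser):
    """Collect the values of all data-title attributes seen while parsing."""

    def __init__(self) -> None:
        super().__init__()
        self.titles: List[str] = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        for name, value in attrs:
            if name == "data-title" and value is not None:
                self.titles.append(value)


def data_titles(html_value: str) -> List[str]:
    """Return every data-title value in html_value, in document order."""
    collector = DataTitleCollector()
    collector.feed(html_value)
    collector.close()
    return collector.titles


# Made-up table markup carrying the attribute on each cell.
found = data_titles(
    '<table><tr><td data-title="z_cluster">1719</td>'
    '<td data-title="fname">joshua</td></tr></table>'
)

# Plain text without any tags, similar in shape to the printed value.
missing = data_titles("z_cluster1719220687105:0custid10104fname joshua")
```

An empty list for the real pair.children[0].value would suggest the widget HTML was generated without data-title attributes, which points at how the records were rendered and not at the find call itself.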