Closed ghost closed 8 years ago
yes?
Why do I get this error?
Records do not line up with data model. The field 'website' is in data_model but not in a record
I don't understand what this means.
You said that 'website' was a field that you wanted to compare, but there is no 'website' field in your record.
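One quick way to see which fields are affected is to compare the data model's field names against a CSV header. This is only a sketch using the standard library, not anything csvlink does itself; the helper name `missing_fields` is made up for illustration, and the header string is the file2.csv header posted further down in this thread.

```python
import csv
import io

def missing_fields(csv_text, model_fields):
    """Return the model fields that do not appear in the CSV header row."""
    header = next(csv.reader(io.StringIO(csv_text)))
    present = {name.strip() for name in header}
    return [f for f in model_fields if f not in present]

# Header row of file2.csv as posted in this thread.
file2_header = (
    "geo_latitude,geo_longitude,star_rating_value,name,city,country,"
    "chain_name,type,address,fax\n"
)

# Hypothetical subset of the fields being compared.
print(missing_fields(file2_header, ["name", "address", "fax", "website"]))
```

Any field this reports is one the data model expects but that file cannot supply.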
I don't understand what I'm doing wrong.
csvlink file1.csv file2.csv --config_file config.json
I get:
ValueError: Records do not line up with data model. The field 'fax' is in data_model but not in a record
file1.csv
id,geo_latitude,geo_longitude,star_rating_value,name,city,country,chain_name,type,address,fax,postal_code,email,website,booking_phone,management_phone,hotel_phone
107,44.457973,26.091842,,minerva,bucharest,romania,,hotel,street gheorghe manu number 2-4 sector- 1 010445 romania,40213123963,010445,reservation@minerva.ro,www.minerva.ro,0040213181294,0040213122738,+40213111555
108,44.435918,26.094242,,opera,bucharest,romania,,hotel,brezoianu street no 37 sector 1 bucharest romania,0040213124858,010132,info@hotelopera.ro,,,,0040213124857
118,54.595541,-5.933663,3,belfast central travelodge,belfast,united kingdom (great britain),travelodge,hotel,15 brunswick street belfast bt2 7ge united kingdom,441232232999,bt2 7ge,valerie.steinbeck@travelodge.ie,www.travelodge.ie,08701911687,08701911687,00448701911700
...
file2.csv
geo_latitude,geo_longitude,star_rating_value,name,city,country,chain_name,type,address,fax
44.449302,26.091212,4,minerva,bucharest,romania,minerva,hotel,street gheorghe manu number 2-4 sector- 1 010445 romania ,40213123963
44.436976,26.094423,3,opera,bucharest,romania,,hotel,brezoianu street no 37 sector 1 bucharest romania ,40213124011
54.5955,-5.9334,3,travelodge belfast,belfast,united kingdom,,hotel,15 brunswick street belfast bt2 7ge united kingdom ,441232232999
...
config.json
{
  "field_names_1": [
    "id",
    "geo_latitude",
    "geo_longitude",
    "star_rating_value",
    "name",
    "city",
    "country",
    "chain_name",
    "type",
    "address",
    "fax",
    "postal_code",
    "email",
    "website",
    "booking_phone",
    "management_phone",
    "hotel_phone"
  ],
  "field_names_2": [
    "geo_latitude",
    "geo_longitude",
    "star_rating_value",
    "name",
    "city",
    "country",
    "chain_name",
    "type",
    "address",
    "fax"
  ],
  "output_file": "output.csv",
  "skip_training": false,
  "training_file": "training.json",
  "sample_size": 15000,
  "recall_weight": 2
}
Do all your examples in the training file include the 'fax' field?
Not all examples in the training data include the 'fax' field, but I have added this to config.json:
"field_definitions" : [
{ "field" : "id", "type" : "String", "Has Missing" : true },
{ "field" : "geo_latitude", "type" : "String", "Has Missing" : true },
{ "field" : "geo_longitude", "type" : "String", "Has Missing" : true },
{ "field" : "star_rating_value", "type" : "String", "Has Missing" : true },
{ "field" : "name", "type" : "String" },
{ "field" : "city", "type" : "String", "Has Missing" : true },
{ "field" : "country", "type" : "String", "Has Missing" : true },
{ "field" : "chain_name", "type" : "String", "Has Missing" : true },
{ "field" : "type", "type" : "String", "Has Missing" : true },
{ "field" : "address", "type" : "String", "Has Missing" : true },
{ "field" : "fax", "type" : "String", "Has Missing" : true },
{ "field" : "postal_code", "type" : "String", "Has Missing" : true },
{ "field" : "email", "type" : "String", "Has Missing" : true },
{ "field" : "website", "type" : "String", "Has Missing" : true },
{ "field" : "booking_phone", "type" : "String", "Has Missing" : true },
{ "field" : "management_phone", "type" : "String", "Has Missing" : true },
{ "field" : "hotel_phone", "type" : "String", "Has Missing" : true }
]
Same error: ValueError: Records do not line up with data model. The field 'fax' is in data_model but not in a record
The training examples have to have the 'fax' field, even if it's null or empty.
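If editing the training file by hand is impractical, something like the following could pad the missing keys. It is a rough sketch, not part of csvdedupe: it assumes the training file is JSON with top-level "match" and "distinct" lists whose entries are pairs of record dicts (the layout may differ between dedupe versions, so check your own file first), and `pad_training_records` is a hypothetical helper name.

```python
import json

def pad_training_records(training, fields):
    """Add every field in `fields` to each training record, using None if absent.

    ASSUMPTION: `training` looks like
    {"match": [[rec_a, rec_b], ...], "distinct": [[rec_a, rec_b], ...]}.
    Returns the number of records that were changed.
    """
    changed = 0
    for label in ("match", "distinct"):
        for pair in training.get(label, []):
            for record in pair:
                absent = [f for f in fields if f not in record]
                for f in absent:
                    record[f] = None  # serialized as null in JSON
                if absent:
                    changed += 1
    return changed

# Toy data for illustration only, not taken from a real training.json.
training = {
    "match": [[{"name": "minerva", "fax": "40213123963"}, {"name": "minerva"}]],
    "distinct": [[{"name": "opera"}, {"name": "minerva", "fax": ""}]],
}
print(pad_training_records(training, ["name", "fax"]))
print(json.dumps(training["match"][0][1]))
```

To apply it to a real file, load it with `json.load`, call the function, and write the result back with `json.dump`, keeping a copy of the original in case the layout assumption does not hold.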
ok. thank you
fields = [
    {'field': 'Region', 'type': 'String'},
    {'field': 'Country', 'type': 'String'},
    {'field': 'Item_Type', 'type': 'String'},
    {'field': 'Sales_Channel', 'type': 'String'},
    {'field': 'Order_Date', 'type': 'String', 'has missing': True},
]
deduper = dedupe.Dedupe(fields)
...
result:
WARNING:dedupe.backport:Dedupe does not currently support multiprocessing on Windows
...
ValueError: Records do not line up with data model. The field 'Region' is in data_model but not in a record
What does this error mean? I want to debug it. Please help.
I have the same issue, even though the record contains the field.
If you changed your dataset slightly (for example by adding new fields), it seems you have to delete your previous training.json file or adapt it to the new fields. Thought I'd share just in case!
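For the "delete it" route, a safer variant is to rename the old file so nothing is lost. The sketch below is only an illustration under assumptions: it expects the training file to be JSON with "match" and "distinct" lists of record pairs (verify this against your own file), and `set_aside_if_stale` is a hypothetical helper, not a csvdedupe feature.

```python
import json
import os
import tempfile

def set_aside_if_stale(path, fields):
    """Rename `path` to `path + '.bak'` if any training record lacks a field.

    ASSUMPTION: the file holds {"match": [[rec, rec], ...], "distinct": [...]}.
    Returns True if the file was renamed.
    """
    with open(path) as f:
        training = json.load(f)
    records = [
        rec
        for label in ("match", "distinct")
        for pair in training.get(label, [])
        for rec in pair
    ]
    stale = any(field not in rec for rec in records for field in fields)
    if stale:
        os.replace(path, path + ".bak")
    return stale

# Demonstration on a throwaway file with toy data.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "training.json")
    with open(path, "w") as f:
        json.dump({"match": [[{"name": "a"}, {"name": "b"}]], "distinct": []}, f)
    print(set_aside_if_stale(path, ["name", "fax"]))
    print(os.path.exists(path + ".bak"))
```

With the old file out of the way, the next run should start labeling from scratch, assuming the tool only reads the training file when it exists.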