Open b4naki opened 1 year ago
I think this file can be downloaded from download. Extract it, rename the CSV to home-depot-sentence-similarity.csv, and place it in the data folder.
Thank you this worked.
Is there a way to download this without entering a phone number?!
Maybe:
1. Download the data from
2. Extract it into the Data directory
3. Use the code below to generate home-depot-sentence-similarity.csv
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Transforms;

namespace SentenceSimilarity
{
    internal class GenData
    {
        // Columns in train.csv:
        // id product_uid product_title search_term relevance
        // 2 100001 Simpson Strong-Tie 12-Gauge Angle angle bracket 3
        public class HomeDepot
        {
            // LoadColumn maps each property to its zero-based column index in train.csv.
            [LoadColumn(0)] public int id { get; set; }
            [LoadColumn(1)] public int product_uid { get; set; }
            [LoadColumn(2)] public string product_title { get; set; }
            [LoadColumn(3)] public string search_term { get; set; }
            [LoadColumn(4)] public string relevance { get; set; }
        }

        private class ProdDescCustomAction : CustomMappingFactory<HomeDepot, CustomMappingOutput>
        {
            // We define the custom mapping between input and output rows that will
            // be applied by the transformation.
            public static void CustomAction(HomeDepot input, CustomMappingOutput output)
                => output.product_description = prodDesc[input.product_uid.ToString()];

            public override Action<HomeDepot, CustomMappingOutput> GetMapping()
                => CustomAction;
        }

        // Defines only the column to be generated by the custom mapping
        // transformation in addition to the columns already present.
        private class CustomMappingOutput
        {
            public string product_description { get; set; }
        }

        // Lookup table: product_uid -> product_description.
        static Dictionary<string, string> prodDesc = new Dictionary<string, string>();

        static void Main(string[] args)
        {
            var mlContext = new MLContext(seed: 1);

            // Load product_descriptions.csv and fill the lookup table.
            var DataPath = Path.GetFullPath(@"..\..\..\..\Data\product_descriptions.csv");
            IDataView dv = mlContext.Data.LoadFromTextFile(DataPath, hasHeader: true, separatorChar: ',', allowQuoting: true,
                columns: new[] {
                    new TextLoader.Column("product_uid", DataKind.String, 0),
                    new TextLoader.Column("product_description", DataKind.String, 1)
                });
            foreach (var row in dv.Preview(maxRows: 150_000).RowView)
            {
                string uid = "", desc = "";
                foreach (KeyValuePair<string, object> column in row.Values)
                {
                    if (column.Key == "product_uid")
                        uid = column.Value.ToString();
                    else if (column.Key == "product_description")
                        desc = column.Value.ToString();
                }
                prodDesc[uid] = desc;
            }

            // Load train.csv and print the first few rows as a sanity check.
            DataPath = Path.GetFullPath(@"..\..\..\..\Data\train.csv");
            IDataView dataView = mlContext.Data.LoadFromTextFile<HomeDepot>(DataPath, hasHeader: true, separatorChar: ',', allowQuoting: true);
            var preViewTransformedData = dataView.Preview(maxRows: 5);
            foreach (var row in preViewTransformedData.RowView)
            {
                var ColumnCollection = row.Values;
                string lineToPrint = "Row--> ";
                foreach (KeyValuePair<string, object> column in ColumnCollection)
                    lineToPrint += $"| {column.Key}:{column.Value}";
                Console.WriteLine(lineToPrint + "\n");
            }

            // Append product_description to every training row via the custom mapping.
            var pipeline = mlContext.Transforms.CustomMapping(new ProdDescCustomAction().GetMapping(), contractName: "product_description");
            var transformedData = pipeline.Fit(dataView).Transform(dataView);

            Console.WriteLine("save file");
            using FileStream fs = new FileStream(Path.GetFullPath(@"..\..\..\..\Data\home-depot-sentence-similarity.csv"), FileMode.Create);
            mlContext.Data.SaveAsText(transformedData, fs, schema: false, separatorChar: ',');
        }
    }
}
After these operations, you should see the data file home-depot-sentence-similarity.csv in the Data folder.
maybe: 1 download data from
Reposting the link does not help. The problem that a phone number is required still exists. I cannot download it without logging in. I don't have a Google account (creating one requires my phone number), and the same goes for others. Even creating a Kaggle account asks for my phone number.
Here is the processed data file.
In the sentence similarity project, the path
var dataPath = Path.GetFullPath(@"..\..\..\..\Data\home-depot-sentence-similarity.csv");
does not exist.
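If it helps with debugging: Path.GetFullPath resolves a relative path against the current working directory, so the same `..\..\..\..\Data\...` string can point to different places depending on how the project is launched. Below is a minimal sketch that prints the resolved path and whether the file is there. `DataPathCheck` and `Resolve` are hypothetical names made up for this example (they are not part of the sample), and the four-levels-up layout is assumed from the path quoted above.

```csharp
using System;
using System.IO;

public static class DataPathCheck
{
    // Hypothetical helper: resolves relativePath against baseDir and
    // reports the absolute path together with whether a file exists there.
    public static (string FullPath, bool Exists) Resolve(string baseDir, string relativePath)
    {
        string fullPath = Path.GetFullPath(Path.Combine(baseDir, relativePath));
        return (fullPath, File.Exists(fullPath));
    }

    public static void Main()
    {
        // Assumption: the Data folder sits four levels above the working directory,
        // as in the path quoted above. Path.Combine avoids hard-coded backslashes.
        string relative = Path.Combine("..", "..", "..", "..", "Data", "home-depot-sentence-similarity.csv");
        var (fullPath, exists) = Resolve(Directory.GetCurrentDirectory(), relative);
        Console.WriteLine(exists ? $"Found: {fullPath}" : $"Missing: {fullPath}");
    }
}
```

If it prints "Missing", the printed absolute path shows where the file would need to be placed (or which working directory the program was started from).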