IBM / fiben-benchmark

A Finance Dataset Benchmark for Natural Language Queries
Apache License 2.0
18 stars, 4 forks

data instances #1

Open arademaker opened 3 years ago

arademaker commented 3 years ago

FIBEN.sql contains the DDL for creating the DB, but what about the data to actually populate it? Where can I find the data?

abdulquamar commented 3 years ago

We have added the data along with a DB2 loading script in a zipped folder (data.zip). Please try it out and let us know if you run into any issues.

nnarodytska commented 2 months ago

Hi Abdul (@abdulquamar ) and all,

I am looking into the data (data.zip) and the queries (FIBEN_Queries). I noticed that many of the queries return empty results.

Here are a few types of questions:

Several questions require filtering on MonetaryAmount.HASAMOUNT, e.g. MonetaryAmount.HASAMOUNT > 1500.0 or MonetaryAmount.HASAMOUNT <= 1. However, all non-empty values in the MonetaryAmount.HASAMOUNT column are greater than 1 and less than 1500, so the output is empty. Concrete example: "Show me the stock if its last traded value is higher than 1500"

Several questions ask about "Nam Davarian", "Hakon Schuster", or "Luis Statz", but these people are not in the Person table. Concrete example: "In how many states does Luis Statz live"

Several questions require filtering on SecuritiesTransaction.Hassettlementdate, e.g. SecuritiesTransaction.Hassettlementdate >= '2018-01-01 00:00:00.000'. However, all settlement dates in SecuritiesTransaction.csv are before 2018, so the output is empty. Concrete example: "Find all transaction on IBM stocks in 2018"
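For anyone who wants to reproduce checks like the three above, here is a minimal sketch in Python. It assumes each table is available as a comma-separated file with a header row of column names, which may not match the actual layout of the files in data.zip. The helper names `column_range` and `missing_values`, the `NAME` column, and the sample rows are all hypothetical and made up for illustration.

```python
import csv
import io


def column_range(csv_file, column, convert=float):
    """Return (min, max) of the non-empty values in `column`, or None if there are none.

    `convert` turns each cell into a comparable value: float for amounts,
    str for ISO-style timestamps (which sort correctly as plain strings).
    """
    values = []
    for row in csv.DictReader(csv_file):
        cell = (row.get(column) or "").strip()
        if cell:
            values.append(convert(cell))
    return (min(values), max(values)) if values else None


def missing_values(csv_file, column, wanted):
    """Return the entries of `wanted` that never appear in `column`."""
    present = {(row.get(column) or "").strip() for row in csv.DictReader(csv_file)}
    return [w for w in wanted if w not in present]


# Made-up rows, NOT taken from data.zip; the header names are assumptions.
amounts = io.StringIO("ID,HASAMOUNT\n1,12.5\n2,\n3,980.0\n")
people = io.StringIO("ID,NAME\n1,Alice Example\n")

print(column_range(amounts, "HASAMOUNT"))
print(missing_values(people, "NAME", ["Alice Example", "Luis Statz"]))
```

If the reported range never crosses a query's threshold (for example, a maximum below 1500 for a `> 1500.0` filter), the query cannot return any rows from that data. The same `column_range` call with `convert=str` can be pointed at a timestamp column to see the earliest and latest settlement dates.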

I just want to check with you whether you have observed this behavior as well.

Thanks!

nina

Gxyrious commented 1 month ago

> We have added the data along with a DB2 loading script in a zipped folder (data.zip). Please try it out and let us know if you run into any issues.

I found that many of the tables in data.zip contain no data, which causes some SQL queries to return empty results. I wonder whether some data is missing.
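A quick way to list which files in an archive are empty is sketched below, using only Python's standard `zipfile` module. It assumes the table files end in `.csv` and treats a file as empty when it has no non-blank lines. The function name `empty_members` and the member names in the demo archive are hypothetical, and the demo builds a small in-memory archive rather than reading the real data.zip.

```python
import io
import zipfile


def empty_members(zip_source, suffix=".csv"):
    """Return the sorted names of archive members ending in `suffix` that have no non-blank lines."""
    empty = []
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(suffix):
                continue
            with archive.open(name) as member:
                # Members are read as bytes; a line with only whitespace counts as blank.
                if not any(line.strip() for line in member):
                    empty.append(name)
    return sorted(empty)


# Build a tiny in-memory archive with made-up member names for illustration.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("TableA.csv", "1,foo\n2,bar\n")
    archive.writestr("TableB.csv", "")
    archive.writestr("load_script.sql", "")
demo.seek(0)

print(empty_members(demo))
```

To run it against a local copy, pass the path to the archive instead of the in-memory buffer, e.g. `empty_members("data.zip")`.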