@mihaela-bornea Good question! Maybe you can sort the triples in the two queries and then perform a string match? That would be faster than two nested for loops. See the sketch below.
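A minimal sketch of that idea, assuming triples inside the WHERE clause are separated by `" . "` and the query has a single WHERE block; real queries with nested blocks would need an actual parser:

```python
def normalize_triples(query: str) -> str:
    """Sort the triple patterns inside the WHERE clause so that
    equivalent queries with permuted triples compare equal."""
    head, _, body = query.partition("WHERE {")
    inner, _, tail = body.rpartition("}")
    triples = sorted(t.strip() for t in inner.split(" . ") if t.strip())
    return head + "WHERE { " + " . ".join(triples) + " }" + tail

def queries_match(gold: str, pred: str) -> bool:
    # String match after normalization instead of pairwise triple comparison.
    return normalize_triples(gold) == normalize_triples(pred)
```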
This sounds like a good feature to include in the eval script itself.
Regarding RIR vs SPARQL evaluation.
As you mention in the paper, your system works with the RIR representation of SPARQL because Herzig et al. (2021) showed that RIR obtains better results with T5. Can you provide the scripts to transform SPARQL to RIR and RIR back to SPARQL?
There might be advantages to using SPARQL directly, in which case the question is how to report results comparable with RIR. One option would be to use the transformation scripts and report both RIR and SPARQL accuracies and BLEU. Any other suggestions?
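A rough sketch of what that dual reporting could look like, where `rir_to_sparql` is a placeholder for the transformation script being requested (hypothetical here, not an existing function):

```python
def dual_accuracy(gold_rir, pred_rir, gold_sparql, rir_to_sparql):
    """Exact-match accuracy in RIR space and, after conversion, in SPARQL space."""
    rir_acc = sum(g == p for g, p in zip(gold_rir, pred_rir)) / len(gold_rir)
    # Convert each RIR prediction back to SPARQL before the second comparison.
    pred_sparql = [rir_to_sparql(p) for p in pred_rir]
    sparql_acc = sum(g == p for g, p in zip(gold_sparql, pred_sparql)) / len(gold_sparql)
    return rir_acc, sparql_acc
```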
@mihaela-bornea Hi Mihaela, I have already uploaded the code to transform SPARQL to and from RIR for MCWQ, as well as an evaluation based on prefix, triple, and filter match rather than exact string match. You can find the code in this repo: https://github.com/ruixiangcui/compir.
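Roughly, such a component-wise match splits a query into its prefix, triple patterns, and FILTER clauses and compares each part as a set, so triple order no longer matters. A minimal sketch, assuming clauses are separated by `" . "`; the actual implementation in the linked repo handles more cases:

```python
def components(query: str):
    """Split a query into (prefix, triple set, filter set)."""
    head, _, body = query.partition("{")
    body = body.rsplit("}", 1)[0]
    clauses = [c.strip() for c in body.split(" . ") if c.strip()]
    filters = {c for c in clauses if c.startswith("FILTER")}
    triples = {c for c in clauses if c not in filters}
    return head.strip(), triples, filters

def component_match(gold: str, pred: str) -> bool:
    # Set comparison makes the check invariant to clause order.
    return components(gold) == components(pred)
```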
Hi. This looks great.
Can you please clarify where to get the data for this argument?
`--train_data_path="datasets/$split/train.$lang.txt"`
Also, I understand the parameter below specifies the SPARQL-to-RIR direction. Can you please confirm?
`--transformation="rir"`
Thank you!
The data is formatted as:

`IN: <source> OUT: <target>\n`

where `<source>` is the natural language utterance and `<target>` is the original SPARQL program.
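A small sketch of reading this format into (source, target) pairs, assuming one example per line (`removeprefix` needs Python 3.9+):

```python
def load_examples(path: str):
    """Read `IN: <source> OUT: <target>` lines into (source, target) pairs."""
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            # Split on the " OUT: " delimiter, then strip the "IN: " prefix.
            source, _, target = line.rstrip("\n").partition(" OUT: ")
            pairs.append((source.removeprefix("IN: "), target))
    return pairs
```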
Yes, it is. Note that we have not experimented with any other transformation on MCWQ.
The evaluation script's output in result_en.txt indicates that the performance metric is exact match at the string level.
A BLEU score is also printed by the evaluation script.
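Both numbers can be recomputed from a list of gold/prediction pairs along these lines; sacrebleu is an assumption here, not necessarily the library the script itself uses:

```python
import sacrebleu  # assumption: any corpus-level BLEU implementation would do

def evaluate(golds: list[str], preds: list[str]):
    """Exact string-match accuracy plus corpus-level BLEU."""
    exact = sum(g == p for g, p in zip(golds, preds)) / len(golds)
    # corpus_bleu takes hypotheses and a list of reference streams.
    bleu = sacrebleu.corpus_bleu(preds, [golds]).score
    return exact, bleu
```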
How do you suggest handling systems that predict equivalent SPARQL queries where the order of the triples is permuted?