[Feature] Support InfiniteBench

OpenBMB has released a new long-context benchmark: https://github.com/OpenBMB/InfiniteBench

Several of its tasks, such as Retrieve.PassKey, Retrieve.Number, and Retrieve.KV, have been used in many recent papers.

| Task Name | Context | # Examples | Avg Input Tokens | Avg Output Tokens | Description |
| --- | --- | --- | --- | --- | --- |
| En.Sum | Fake Book | 103 | 171.5k | 1.1k | Summarization of a fake book created with core entity substitution. |
| En.QA | Fake Book | 351 | 192.6k | 4.8 | Free-form question answering based on the fake book. |
| En.MC | Fake Book | 229 | 184.4k | 5.3 | Multiple choice questions derived from the fake book. |
| En.Dia | Script | 200 | 103.6k | 3.4 | Identification of talkers in partially anonymized scripts. |
| Zh.QA | New Book | 175 | 2068.6k | 6.3 | Question answering on a set of newly collected books. |
| Code.Debug | Code Document | 394 | 114.7k | 4.8 | Finding which function in a code repo contains a crashing error (in multiple choice form). |
| Code.Run | Synthetic | 400 | 75.2k | 1.3 | Simulating execution of multiple simple, synthetic functions. |
| Math.Calc | Synthetic | 50 | 43.9k | 43.9k | Calculations involving super-long arithmetic equations. |
| Math.Find | Synthetic | 350 | 87.9k | 1.3 | Finding special integers in a lengthy list. |
| Retrieve.PassKey | Synthetic | 590 | 122.4k | 2.0 | Retrieving hidden keys in a noisy long context. |
| Retrieve.Number | Synthetic | 590 | 122.4k | 4.0 | Locating repeated hidden numbers in a noisy long context. |
| Retrieve.KV | Synthetic | 500 | 89.9k | 22.7 | Finding the corresponding value from a dictionary and a key. |
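
For anyone unfamiliar with the pass-key retrieval task, here is a minimal sketch of how such a sample could be built and scored. The filler sentence, prompt wording, 5-digit key, and the names `make_passkey_sample` and `score_passkey` are all assumptions for illustration. They are not taken from InfiniteBench or OpenCompass, and the real dataset should be loaded from the repository linked above.

```python
import random
import re

# Hypothetical filler text; the actual benchmark may use different noise.
NOISE = "The grass is green. The sky is blue. The sun is yellow. "


def make_passkey_sample(n_noise: int, seed: int = 0) -> dict:
    """Build one synthetic pass-key sample: a key hidden among noise sentences."""
    rng = random.Random(seed)
    passkey = str(rng.randint(10000, 99999))  # assumed 5-digit key
    needle = f"The pass key is {passkey}. Remember it. "
    # Number of noise blocks placed before the needle.
    position = rng.randint(0, n_noise)
    context = NOISE * position + needle + NOISE * (n_noise - position)
    return {
        "context": context,
        "question": "What is the pass key?",
        "answer": passkey,
    }


def score_passkey(prediction: str, answer: str) -> bool:
    """Treat a prediction as correct if its first run of digits equals the answer."""
    match = re.search(r"\d+", prediction)
    return match is not None and match.group() == answer


sample = make_passkey_sample(n_noise=1000, seed=42)
print(len(sample["context"]), sample["answer"])
print(score_passkey(f"The pass key is {sample['answer']}.", sample["answer"]))
```

Raising `n_noise` lengthens the context while the expected answer stays a few tokens long, which matches the large input and tiny output sizes listed for the retrieval tasks in the table.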

open-compass / opencompass