Should the learning domain contain only assertion texts (like "Python is a high-level general-purpose programming language")?

Hi. Should the learning domain contain only assertion texts (like "Python is a high-level general-purpose programming language" in your example)? The first step in your pipeline is Query Generation: "For a given text from our domain, we first use a T5 model that generates a possible query for the given text. E.g. when your text is “Python is a high-level general-purpose programming language”, the model might generate a query like “What is Python”. You can find various query generators on our doc2query-hub." Does that mean that texts that cannot be converted into queries (e.g. "Investment consulting for legal entities and individuals.") cannot be used for training?

UKPLab / gpl

Should the learning domain contain only assertion texts (like "Python is a high-level general-purpose programming language")? #30