For example, let x represent English (en) sentences and y represent their Yoruba (yor) translations:
x_1: how are you my son.
x_2: the fuel in the lamp is about to finish.
y_1: bawo ni omo mi.
y_2: epo ti o wa ninu fitila naa ti fẹrẹ pari.
Concatenated sentences:
x: how are you my son. the fuel in the lamp is about to finish.
y: bawo ni omo mi. epo ti o wa ninu fitila naa ti fẹrẹ pari.
Sentence concatenation, as explored in the paper "Measuring the Impact of Data Augmentation Methods for Extremely Low-Resource NMT", involves randomly concatenating multiple sentences, with a separator token inserted between the concatenated sentences. Here, we need a class or script that performs the above for a given language pair. More information can be found in the paper "Sentence Concatenation Approach to Data Augmentation for Neural Machine Translation".
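As a starting point, here is a minimal sketch of what such a script could look like. It is not the implementation from either paper. The function name `concat_augment`, its parameters, and the default separator string `" <sep> "` are all assumptions made for illustration, so the separator should be replaced with whatever token the target NMT setup expects. The sketch picks two distinct sentence pairs at random and joins them in the same order on the source and target sides, so each augmented pair stays parallel.

```python
import random
from typing import List, Optional, Sequence, Tuple


def concat_augment(
    src: Sequence[str],
    tgt: Sequence[str],
    num_samples: int,
    sep: str = " <sep> ",  # assumed separator token, not taken from the papers
    seed: Optional[int] = None,
) -> List[Tuple[str, str]]:
    """Build augmented (source, target) pairs by concatenating two random pairs.

    `src[i]` and `tgt[i]` are assumed to be translations of each other.
    """
    if len(src) != len(tgt):
        raise ValueError("src and tgt must contain the same number of sentences")
    if len(src) < 2:
        raise ValueError("at least two sentence pairs are needed to concatenate")

    rng = random.Random(seed)  # local RNG so a seed gives reproducible output
    augmented = []
    for _ in range(num_samples):
        # Draw two distinct indices; the same order is used on both sides.
        i, j = rng.sample(range(len(src)), 2)
        augmented.append((src[i] + sep + src[j], tgt[i] + sep + tgt[j]))
    return augmented


if __name__ == "__main__":
    en = [
        "how are you my son.",
        "the fuel in the lamp is about to finish.",
    ]
    yor = [
        "bawo ni omo mi.",
        "epo ti o wa ninu fitila naa ti fẹrẹ pari.",
    ]
    for x, y in concat_augment(en, yor, num_samples=2, seed=0):
        print("x:", x)
        print("y:", y)
```

Because the indices are drawn at random, the two sentences may appear in either order, but the order is always identical for x and y. Concatenating more than two sentences, or sampling with length constraints, would be straightforward extensions of the same loop.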