facebookresearch / fairseq2

FAIR Sequence Modeling Toolkit 2
https://facebookresearch.github.io/fairseq2/
MIT License

Using a pipeline_builder shared pointer multiple times leads to segfaults #369

Open artemru opened 8 months ago

artemru commented 8 months ago

Describe the bug: Segfault during pipeline creation

Describe how to reproduce:

from fairseq2.data import read_sequence
from fairseq2.data.data_pipeline import DataPipeline, DataPipelineBuilder

pipeline_build = read_sequence(list(range(100)))  # this builder is shared by the two shuffle operations

concat_pipe = DataPipeline.concat([pipeline_build.shuffle(10).and_return(),
                                   pipeline_build.shuffle(10).and_return()]).and_return()
# this should have raised a RuntimeError
next(iter(concat_pipe))

Describe the expected behavior: This should raise an explicit RuntimeError
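
To illustrate the behavior I have in mind, here is a minimal pure-Python sketch. It assumes the builder hands its internal state over when and_return() is called, so any later call on the same builder should be rejected. OneShotBuilder, its map() method, and the error message are hypothetical stand-ins for illustration; this is not fairseq2's actual implementation.

```python
class OneShotBuilder:
    """Hypothetical stand-in for a pipeline builder (not fairseq2's class)."""

    def __init__(self, items):
        self._items = list(items)
        self._ops = []
        self._consumed = False

    def _check_usable(self):
        # Fail loudly instead of touching state that was already handed over.
        if self._consumed:
            raise RuntimeError(
                "The builder has already been consumed by and_return() "
                "and cannot be reused."
            )

    def map(self, fn):
        self._check_usable()
        self._ops.append(fn)
        return self

    def and_return(self):
        self._check_usable()
        self._consumed = True
        # Hand the state over to the pipeline and drop the builder's references.
        items, ops = self._items, self._ops
        self._items, self._ops = None, None

        def generate():
            for item in items:
                for op in ops:
                    item = op(item)
                yield item

        return generate()


builder = OneShotBuilder(range(3))
first = builder.map(lambda x: x + 1).and_return()

try:
    builder.map(lambda x: x * 2)  # second use of the same builder
except RuntimeError as exc:
    print(f"RuntimeError: {exc}")
```

With a check like this, the second use of the builder fails with a clear message at the point of reuse rather than crashing later.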

cbalioglu commented 8 months ago

Thanks for the report! I will investigate it today.