Open bilgeyucel opened 1 day ago
Some additional context.
As of now, the `FileTypeRouter` doesn't give users an explicit way to pass additional metadata to the converters that receive the routed sources.
The output typing of the `FileTypeRouter` is also wrong right now: it states that all its outputs are of type `List[Path]`, but they should actually be `List[Union[Path, ByteStream]]`. This is basically the same type as its `sources` input, except for `str`, because internally `str` values are converted to `Path` and returned that way.
I propose we fix the output type so that it correctly reflects the actual output. Additionally, we change the `FileTypeRouter` to convert all the input `sources` to `ByteStream` if any `meta` is sent by the user; that way we can route the files together with their `meta` without adding new outputs to the Component. We must convert to `ByteStream` only if `meta` is received.
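To make the proposal concrete, here is a rough sketch of the routing behaviour, not the actual Haystack implementation. The `ByteStream` class below is a reduced stand-in so the snippet runs on its own, and `route_sources`, the MIME guessing via `mimetypes`, and the `"unclassified"` key are all assumptions made for illustration.

```python
import mimetypes
from collections import defaultdict
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, List, Optional, Union


@dataclass
class ByteStream:
    """Reduced stand-in for Haystack's ByteStream (assumption, for illustration)."""

    data: bytes
    meta: Dict[str, Any] = field(default_factory=dict)
    mime_type: Optional[str] = None


Source = Union[str, Path, ByteStream]
Routed = Dict[str, List[Union[Path, ByteStream]]]


def route_sources(
    sources: List[Source],
    meta: Optional[Union[Dict[str, Any], List[Dict[str, Any]]]] = None,
) -> Routed:
    """Hypothetical helper: group sources by MIME type, attaching meta if given."""
    # A single dict applies to every source; a list must match one-to-one.
    metas = [meta] * len(sources) if isinstance(meta, dict) else meta
    if metas is not None and len(metas) != len(sources):
        raise ValueError("meta must be a dict or a list as long as sources")

    routed: Routed = defaultdict(list)
    for index, source in enumerate(sources):
        # str inputs are normalized to Path, so str never appears in the output.
        if isinstance(source, str):
            source = Path(source)

        if isinstance(source, ByteStream):
            mime_type = source.mime_type
        else:
            mime_type = mimetypes.guess_type(str(source))[0]

        # Convert to ByteStream only when meta was received.
        if metas is not None:
            if isinstance(source, Path):
                source = ByteStream(data=source.read_bytes(), mime_type=mime_type)
            # Build a new object instead of mutating the caller's ByteStream.
            source = ByteStream(
                data=source.data,
                meta={**source.meta, **metas[index]},
                mime_type=source.mime_type,
            )

        routed[mime_type or "unclassified"].append(source)
    return dict(routed)
```

Without `meta`, paths come back as `Path` objects, which is why the output type is `List[Union[Path, ByteStream]]`; with `meta`, every routed item is a `ByteStream` carrying that metadata, so no extra outputs are needed.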
Is your feature request related to a problem? Please describe.
When a preprocessing pipeline starts with `FileTypeRouter`, which is usually the case when we use multiple converters, it's not possible to provide meta information for files.

Describe the solution you'd like
Let's add a `meta` input to the `FileTypeRouter`, and this component can use the `ByteStream` dataclass to pass this info to converters.

Describe alternatives you've considered
Having separate metadata outputs for each file type: `router.text/plain_meta`

Additional context
The same issue was opened a year ago: #6392
cc: @silvanocerza