esmero / strawberryfield

A Field of strawberries
GNU Lesser General Public License v3.0

The same remote source ingested twice inside a single ADO will make the File persister fail #272

Open DiegoPino opened 1 year ago

DiegoPino commented 1 year ago

What?

Extra edge case (and a fatal one)

https://github.com/esmero/strawberryfield/blob/d498c09c828beebb34411720f2ed36b58162a507/src/StrawberryfieldFilePersisterService.php#L794-L816

But if, e.g., one AMI column (images) contains http://myimage.com/image1.jpeg and another (docs) contains http://myimage.com/image2.jpeg, then when persisting, the matching means the flattener will return url as an array containing the same final destination TWICE.

The solution (for a hard-to-trigger error): when url comes back as an ARRAY of URLs, we compare them. If they are all the same, we pick one; if not, well, that cannot be, and we die with errors.
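A minimal sketch of that compare-and-pick check, written in Python only for brevity (the actual service is PHP). The function name `resolve_single_url` and the use of `ValueError` are hypothetical and not taken from the strawberryfield code base; the sketch only assumes the flattened `url` value is either a single string or a list of strings.

```python
from typing import Sequence, Union


def resolve_single_url(url: Union[str, Sequence[str]]) -> str:
    """Collapse a flattened `url` value into one URL, or fail loudly.

    Hypothetical helper: illustrates the proposed check, not the real API.
    """
    # A plain string is already a single destination.
    if isinstance(url, str):
        return url
    # De-duplicate while keeping first-seen order.
    unique = list(dict.fromkeys(url))
    if len(unique) == 1:
        # Same destination repeated: pick one.
        return unique[0]
    # Zero or several distinct destinations: this cannot be persisted.
    raise ValueError(
        f"Expected exactly one distinct URL, got {len(unique)}: {unique}"
    )
```

With this shape, a duplicated destination collapses to a single string, while two different destinations stop the persist step with an error instead of silently writing the file twice.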

DiegoPino commented 1 year ago

@alliomeria this is what you saw today with the double /OBJ datastreams

aksm commented 1 year ago

@DiegoPino Have you started work on this already? Should I try to tackle it?