metafacture / metafacture-core

Core package of the Metafacture tool suite for metadata processing.
https://metafacture.org
Apache License 2.0
69 stars 34 forks source link

setreplace does not work with filemap #381

Closed TobiasNx closed 2 years ago

TobiasNx commented 3 years ago

It seems that setreplace is not working with a externalfile via filemap: I have set up a simple case:

https://github.com/TobiasNx/notWorkingFlux/tree/main/setreplaceWithFile

<?xml version="1.0" encoding="UTF-8"?>
<metamorph xmlns="http://www.culturegraph.org/metamorph" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    version="1">
    <rules>
        <data source="name">
            <setreplace map="replace" />
        </data>
    </rules>
    <maps>
        <filemap name="replace" files="replaceAcademicTitle.tsv" />
    </maps>
</metamorph>

I receive following error message:

Exception in thread "main" org.metafacture.metamorph.MetamorphException: Error while executing the Metamorph transformation pipeline: null
        at org.metafacture.metamorph.DefaultErrorHandler.error(DefaultErrorHandler.java:33)
        at org.metafacture.metamorph.Metamorph.send(Metamorph.java:361)
        at org.metafacture.metamorph.Metamorph.dispatch(Metamorph.java:322)
        at org.metafacture.metamorph.Metamorph.access$000(Metamorph.java:68)
        at org.metafacture.metamorph.Metamorph$1.literal(Metamorph.java:205)
        at org.metafacture.mangling.StreamFlattener.literal(StreamFlattener.java:105)
        at org.metafacture.metamorph.Metamorph.literal(Metamorph.java:297)
        at org.metafacture.json.JsonDecoder.decodeValue(JsonDecoder.java:193)
        at org.metafacture.json.JsonDecoder.decodeObject(JsonDecoder.java:162)
        at org.metafacture.json.JsonDecoder.decode(JsonDecoder.java:146)
        at org.metafacture.json.JsonDecoder.process(JsonDecoder.java:110)
        at org.metafacture.json.JsonDecoder.process(JsonDecoder.java:33)
        at org.metafacture.io.RecordReader.emitRecord(RecordReader.java:111)
        at org.metafacture.io.RecordReader.process(RecordReader.java:100)
        at org.metafacture.io.RecordReader.process(RecordReader.java:39)
        at org.metafacture.io.FileOpener.process(FileOpener.java:100)
        at org.metafacture.io.FileOpener.process(FileOpener.java:40)
        at org.metafacture.flux.parser.StringSender.process(StringSender.java:38)
        at org.metafacture.flux.parser.Flow.start(Flow.java:110)
        at org.metafacture.flux.parser.FluxProgramm.start(FluxProgramm.java:156)
        at org.metafacture.runner.Flux.main(Flux.java:79)
Caused by: java.lang.UnsupportedOperationException
        at org.metafacture.metamorph.api.helpers.AbstractReadOnlyMap.entrySet(AbstractReadOnlyMap.java:85)
        at org.metafacture.commons.tries.SetReplacer.addReplacements(SetReplacer.java:38)
        at org.metafacture.metamorph.functions.SetReplace.process(SetReplace.java:32)
        at org.metafacture.metamorph.api.helpers.AbstractSimpleStatelessFunction.receive(AbstractSimpleStatelessFunction.java:35)
        at org.metafacture.metamorph.Data.receive(Data.java:36)
        at org.metafacture.metamorph.Metamorph.send(Metamorph.java:359)
        ... 19 more
dr0i commented 2 years ago

@TobiasNx please test the branch if it fixes the issue.

TobiasNx commented 2 years ago

Result seems to work: https://github.com/TobiasNx/notWorkingFlux/blob/b079f0be4fbe18b17619d0cf4946a22ffd3d22db/setreplaceWithFile/notitle.json

https://github.com/TobiasNx/notWorkingFlux/commit/b079f0be4fbe18b17619d0cf4946a22ffd3d22db

Even with multiple replacements in one literal.

+1

TobiasNx commented 2 years ago

Oh: It seems not to work for empty strings: See my three attemps: https://github.com/TobiasNx/notWorkingFlux/commit/2f3a817743b6a8ddc889d58ce3d60345b5172374 => no change https://github.com/TobiasNx/notWorkingFlux/commit/85bd1ef5ddd3221124cbc100f91d2935d0228ca0 => no change https://github.com/TobiasNx/notWorkingFlux/commit/5214444f629ba3cc9fee5ae0823e6d4c91ff8576 => since insonsistent with "" changes not to emtpy but to ""

dr0i commented 2 years ago

This could be provided, but only by introducing a new parameter for the function (to be backward compatible). This would be a new feature, not just a bug fix. So, you really need this, give a +1

TobiasNx commented 2 years ago

If this is an additional feature, then let*s try a workaround first and introduce the new feature if needed with an new ticket.

So +1 on the setReplace functionality with map. The new feature would be nice but can be solved otherwise (for now)

dr0i commented 2 years ago

Note then: to allow the empty replacement set split.split(line, -1); in FileMap:88.

TobiasNx commented 2 years ago

Note then: to allow the empty replacement set split.split(line, -1); in FileMap:88.

Could you explain this a little bit more? I would not know how to use it.

dr0i commented 2 years ago

That's just a reminder of how to implement the feature if an issue is opened.