srvaroa / labeler

Label manager for PRs and Issues based on configurable conditions
https://github.com/srvaroa/labeler
MIT License

Body filtering #156

Open Caellian opened 1 month ago

Caellian commented 1 month ago

I'd like to use this for labeling issues in Conky. However, people often paste their configuration files into issue reports, and those files contain the same keywords I'd like to check for. Matching label keywords against the whole body would therefore yield a lot of false positives.

Example:
#### Description
Some issue that mentions HW **sensors** not working. Therefore the issue should only be labeled with `sensors` label.

#### Config
User pastes their whole config as part of bug report:
```lua
conky.text = [[
HDD: ${hdd_variable} -- HDD is detected and issue labeled `disk io`
Volume: ${volume_variable} -- Volume is detected and issue labeled `audio`
]]
```

Not the best or most accurate example, but I hope it gets the point across: based on the contents of that code block, the body could trigger every label.

I suggest adding a mechanism that excludes parts of the Markdown content from regex matching, either by checking only the content under a specific title (e.g. Details/Description) or by excluding code blocks. The latter might be easier to implement: remove everything matching "```(\w+)?.+```".
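
For illustration, here is a minimal Go sketch of the code-block exclusion idea. It is not taken from labeler; `stripCodeBlocks` and `fencedBlock` are hypothetical names. One detail worth noting: in Go's `regexp` package `.` does not match newlines unless the `s` flag is set, and a greedy `.+` would swallow everything between the first and the last fence, so the sketch uses `(?s)` with a non-greedy `.*?`.

```go
package main

import "regexp"

// fencedBlock matches one fenced code block: three backticks, an optional
// language tag, the block contents, and the closing three backticks.
// (?s) lets . match newlines; .*? is non-greedy, so each block is matched
// on its own instead of spanning from the first fence to the last.
var fencedBlock = regexp.MustCompile("(?s)`{3}\\w*.*?`{3}")

// stripCodeBlocks returns body with every fenced code block removed.
// Hypothetical helper, not part of labeler.
func stripCodeBlocks(body string) string {
	return fencedBlock.ReplaceAllString(body, "")
}
```

Calling `stripCodeBlocks` on the issue body before evaluating the label regexes would leave only the prose, so keywords inside the pasted config no longer count.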

srvaroa commented 2 weeks ago

Hi, sorry, I forgot to reply to this one. This makes sense; I'll have to experiment a bit with something similar to your suggestion. I can see something like:

- body:
  pre-filter: <remove-all-regex>
  filter: <actual matching regex>
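
A rough sketch of how such a two-stage condition could be evaluated, assuming `pre-filter` means "delete every match from the body" and `filter` is then matched against what remains. The function name `bodyMatches` and the string-based signature are assumptions for illustration, not labeler's actual API.

```go
package main

import (
	"fmt"
	"regexp"
)

// bodyMatches removes every match of preFilter from body and then reports
// whether filter matches the remaining text. An empty preFilter leaves the
// body untouched. Hypothetical helper, not part of labeler.
func bodyMatches(body, preFilter, filter string) (bool, error) {
	if preFilter != "" {
		pre, err := regexp.Compile(preFilter)
		if err != nil {
			return false, fmt.Errorf("pre-filter: %w", err)
		}
		body = pre.ReplaceAllString(body, "")
	}
	re, err := regexp.Compile(filter)
	if err != nil {
		return false, fmt.Errorf("filter: %w", err)
	}
	return re.MatchString(body), nil
}
```

With the Conky example, a pre-filter that deletes the `[[ ... ]]` block would stop a `disk io` rule from firing on the pasted config, while a `sensors` rule would still match the description.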

Caellian commented 1 week ago

Thanks for the reply.

I think simple title-based filtering might be a good addition to pre-filter down the line. It's often much simpler to exclude or include everything in a section than to write a regex that removes content. This is less clear from the example, because \[\[.*?\]\] works perfectly in that case, but there are cases where the content to remove can only be distinguished by its location, because it is structured the same as the content to match:

Example:

## Description

[...] I believe same might be the case with **Ubuntu**, but I don't have it installed so that needs confirmation. Also, **Void Linux** doesn't use systemd so that's very likely also the case there.

## System

OS: Linux
Distro: **Arch Linux**

The sanest way of dealing with this using regex would be to remove .*## System, but that feels more error-prone than being able to extract sections.

Go has a package for Markdown, but I think using it would be more complicated than simply splitting the input on lines that start with #{1,6} and internally producing something like:

sections:
  - level: 3
    title: <title>
    body: <content>
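
The splitting step could look roughly like the Go sketch below. The `Section` type and `splitSections` function are hypothetical and only mirror the `level`/`title`/`body` structure above. As an assumption, text before the first heading is kept as a section with level 0 and an empty title.

```go
package main

import (
	"regexp"
	"strings"
)

// Section is one heading plus the text up to the next heading.
// Hypothetical type mirroring the proposed level/title/body structure.
type Section struct {
	Level int // number of leading '#' characters (1 to 6), 0 for the preamble
	Title string
	Body  string
}

// heading matches a heading line such as "## System": 1 to 6 '#' characters,
// whitespace, then the title with trailing whitespace trimmed.
var heading = regexp.MustCompile(`^(#{1,6})\s+(.*?)\s*$`)

// splitSections splits body on heading lines. Text before the first heading
// is returned as a section with Level 0 and an empty Title.
func splitSections(body string) []Section {
	var sections []Section
	var lines []string
	cur := Section{}
	// flush stores the collected lines as the body of the current section.
	flush := func() {
		cur.Body = strings.TrimSpace(strings.Join(lines, "\n"))
		if cur.Level > 0 || cur.Body != "" {
			sections = append(sections, cur)
		}
		lines = nil
	}
	for _, line := range strings.Split(body, "\n") {
		if m := heading.FindStringSubmatch(line); m != nil {
			flush()
			cur = Section{Level: len(m[1]), Title: m[2]}
			continue
		}
		lines = append(lines, line)
	}
	flush()
	return sections
}
```

One limitation of this line-based approach: a line starting with `#` inside a fenced code block (a shell comment, for instance) would also be treated as a heading, so code blocks would need to be handled before splitting.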

Filtering can then use:

- body:
  - title:
      - content: <title name regex>
        level: <title level number>
      - content: <title name regex>
        level: <title level number>
    label: <label name>

# Probably requires some additional abstraction:
- body:
  - section:
      - title: <title name regex>
        <any body items except section>
    label: <label name>

If both get added, I suggest that pre-filter run after section selection.
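
To make the suggested order concrete, here is a hedged Go sketch: select sections by title and level first, then apply the pre-filter to each selected body, then run the matching regex. `Section`, `SectionRule`, and `Matches` are hypothetical names, and treating level 0 as "any level" is an assumption of this sketch, not part of the proposal.

```go
package main

import "regexp"

// Section is a heading with its content (hypothetical type).
type Section struct {
	Level int
	Title string
	Body  string
}

// SectionRule is a hypothetical in-memory form of one section condition.
type SectionRule struct {
	Title     string // regex matched against the section title
	Level     int    // required heading level; 0 accepts any level (assumption)
	PreFilter string // optional regex; its matches are removed from the body
	Filter    string // regex that must match what remains
}

// Matches reports whether any selected section satisfies the rule.
// Order: select by title and level, then pre-filter, then filter.
func (r SectionRule) Matches(sections []Section) (bool, error) {
	title, err := regexp.Compile(r.Title)
	if err != nil {
		return false, err
	}
	filter, err := regexp.Compile(r.Filter)
	if err != nil {
		return false, err
	}
	var pre *regexp.Regexp
	if r.PreFilter != "" {
		if pre, err = regexp.Compile(r.PreFilter); err != nil {
			return false, err
		}
	}
	for _, s := range sections {
		if r.Level != 0 && s.Level != r.Level {
			continue // wrong heading level
		}
		if !title.MatchString(s.Title) {
			continue // title does not match
		}
		body := s.Body
		if pre != nil {
			body = pre.ReplaceAllString(body, "")
		}
		if filter.MatchString(body) {
			return true, nil
		}
	}
	return false, nil
}
```

Running the pre-filter after selection means it only has to clean up the sections that were picked, so a distro rule restricted to the System section would ignore the distro names mentioned under Description.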