Open CerberusQc opened 2 months ago
Note:
For a cursory review:
bufio.NewScanner
is used for:
soci
command IO (processSociIO
)notation
command IO (processNotationIO
)cosign
command IO (processCosignIO
)/proc/net/*
files (ReadStatsFileData
)/etc/os-release
(DistroName
)parseEnvVars
) provided via --env-file
PipeTagger.Run
)/etc/hosts
files (parseHostsButSkipMarkedRegion
)I believe that:
That leaves 7, which is certainly a problem
Hey @CerberusQc
May I interest you in patching https://github.com/containerd/nerdctl/blob/main/pkg/composer/pipetagger/pipetagger.go#L97-L110 ? :-)
Sorry for the late response, I am currently moving into a new house, I won't be able to work on this before 1-2 weeks.
I am wondering if it would be better to move the function into it's own struct to be more generic and allow to stay DRY in that case. If so, I am not really familiar with GO folders structure, where should I put the generic class ?
I was thinking about something like
type LogProcessor struct {
wg *sync.WaitGroup
reader io.Reader
dataChan chan string
}
func NewLogProcessor(wg *sync.WaitGroup, reader io.Reader, dataChan chan string) *LogProcessor {
return &LogProcessor{
wg: wg,
reader: reader,
dataChan: dataChan,
}
}
func (p *LogProcessor) Process() {
defer p.wg.Done()
defer close(p.dataChan)
r := bufio.NewReader(p.reader)
var err error
for err == nil {
var s string
s, err = r.ReadString('\n')
if len(s) > 0 {
p.dataChan <- strings.TrimSuffix(s, "\n")
}
if err != nil && err != io.EOF {
log.L.WithError(err).Error("failed to read log")
}
}
}
This is just a draft and did not run it, naming need improvement.
No problem. Congrats on the new house!
On your code suggestion - let's chat on your (future) PR when you get there.
On the location, what about under pkg/logging/processor.go
? (and then let see what maintainers think)
See you in a few weeks then ;).
What is the problem you're trying to solve
As discussed in https://github.com/containerd/nerdctl/issues/3343, the usage of bufio.Scanner when we do not have the control over the length of the message processed by the scanner can put the Scanner in a I/O error state failing silently and blocking the process sending a message.
@apostasie recommend that the usage of bufio.Scanner across the project is revisited.
Describe the solution you'd like
A solution is suggested with https://github.com/containerd/nerdctl/pull/3366 but might not be a "one size fit all solution".
Additional context
Documentation
The documentation state the following:
Scanning stops unrecoverably at EOF, the first I/O error, or a token too large to fit in the Scanner.Buffer. When a scan stops, the reader may have advanced arbitrarily far past the last token. Programs that need more control over error handling or large tokens, or must run sequential scans on a reader, should use bufio.Reader instead.