go-graphite / go-carbon

Golang implementation of Graphite/Carbon server with classic architecture: Agent -> Cache -> Persister
MIT License
805 stars 123 forks source link

[BUG] Whispers under symlink are not returned for queries with wildcards #572

Open interfan7 opened 12 months ago

interfan7 commented 12 months ago

Describe the bug (also described in carbonapi issues) For metrics in Whisper files which are under symlink (in the path):

Seems like something in the search engine is effected by a path component being a symlink, which in our case is directory y.

Logs No error messages in both go-carbon and carbonapi while reproducing it. In go-carbon log I do see INFO messages of metrics which are not under symlink. In carbonapi log I see INFO messages with the query search text i.e. x.y*.z.w.

Go-carbon Configuration: Mostly default, I will provide config if a contributor requests. I believe it may not be relevant though given the nature of the issue.

Metric retention and aggregation schemas See above.

Simplified query (if applicable) See description 👆🏻

Additional context go-carbon 0.17.1 and carbonapi 0.16.0~1.

deniszh commented 12 months ago

Hi @interfan7,

Could you please share your go-carbon.conf file and please provide details where exactly symlink located and where it's pointing to and what's files are located there (because it's not clear from description above). For example, whisper files located in /var/lib/whisper, symlink is /var/lib/whisper/path, pointing to /var/lib/anotherwhisper, where's only *.wsp files located.

deniszh commented 12 months ago

Well, I just checked code - carbonserver uses filepath.Walk which is do not support symlink by design. In theory, it can be implemented, but can significantly affect code complexity and performance, I'm doubt that we will implement it in near future.

interfan7 commented 12 months ago

@deniszh

Would you mind to note what is the code complexity? Is it just the addition of this logic (kind of pseudo-code because I'm not a Go programmer) inside the walk:

if (file is symlink)
{
   (realpath, error) = symlink_root = filepath.EvalSymlinks(file)
   if (!error)
   {
      filepath.WalkDir(realpath, same_fn)
   }
}

This whole logic could be wrapped under user-defined config flag for performance preference although I'd hope the added logic is negligible in performance versus the actual read of a Whisper file.

BTW according to the doc filepath.WalkDir is more efficient than filepath.Walk.

deniszh commented 11 months ago

Hi @interfan7

I'm not considering myself as professional Go programmer, but code above is more complex than just simple filepath.Walk, isn't it? :) I just want to emphasize that we'll not use symlink in prod, hence we will probably not implement symlink support in go-carbon in the near future. Also, please note that we're talking about carbonserver here and all related machinery (trie/trigram indices etc), probably normal carbonlink still support symlinks. Not sure if it will help in your case, though. PRs from community are always welcome and appreciated, of course.

deniszh commented 11 months ago

BTW according to the doc filepath.WalkDir is more efficient than filepath.Walk.

Yes, but WalkDir loads dir content in memory, can be quite heavy for big directories.

Civil commented 11 months ago

My .02: if you are traversing symlinks you need to some way to workaround endless loops as well.