Closed pombredanne closed 6 years ago
@pombredanne Currently each pre-scan
plugin works in a way that if an option such as --ignore abc
is passed, any plugin in the pre-scan
entrypoint with the entry name ignore
is initialized with the user input abc
.
If a plugin provides more than one option, what do you think would be a better approach?
--ignore-glob '*txt' --ignore-regex '^\.'
will call the ignore
plugin as
plugin('*.txt')
plugin('^\.')
or
--ignore-glob '*txt' --ignore-regex '^\.'
will call the ignore
plugin as
plugin(dict('ignore-glob'='*.txt', 'ignore-regex'='^\.'))
Also, should we add a common plugin prefix to the options? say if ignore
plugin defines 2 options glob
and regex
, do we add the plugin name such as --ignore-glob
and --ignore-regex
or we let the plugins practice complete autonomy in the naming scheme?
@yashdsaraf Thanks... here is my thinking on this:
We should generalize the notion of plugin to decouple the UI (e.g. the CLI option name) from the actual declaration (as an entry point). This would mean that every plugin would have the same structure and could contribute or more options as they please and define these options entirely.
With this generalization a plugin becomes something that receives an iterable of resources and returns an iterable of resources. We should therefore better specify this resource object. In the future a plugin could express interest only in some resource that match certain criteria too.
With this, the only thing that a plugin would declare in the entry point is its "name" and its plugin callable for a given stage of the scan (pre, scan, post, output). All options would be provided by the plugin class itself. And all plugins would have the same interface: act on a stream of Resources.
This would mean reworking the way all the plugins work today to a single common way. When they are invoked is the only thing that would be special to a given plugin. This would IMHO pave nicely the support for scan plugins (and later the support for plugins that could be written in another language than Python)
Does this make sense?
Okay, this is what I got from that
resources
as input and return the same after processingResource
class will need some modifications to be handled by all the pluginsI think it's better to start restructuring the plugins to take in iterable of resource, rest points will follow automatically.
perfect!
@pombredanne Like you said, ultimately all the plugins would return an iterable of resources. In this case, how would the output plugins work?
Good point: the output plugins could eitherr:
The later is likely cleaner
I'm sorry I didn't get that, could you give an example of what should happen if a user calls scancode cli with multiple format options.
See the discussion in #789
Ah, got it. Each option supplied by the format plugins would take the output file path as input. Also, what happens if a path is not given? Do we print the output to stdout or simply abort due to the missing param?
@yashdsaraf no path given would mean stdout output which would be messy if there are more than one ; ) but that's OK, we cannot prevent folks to shoot themselves in the foot
@yashdsaraf I think what would be most useful for @haikoschol is to have this available for post-scan plugins right now. Do you think you can have a look into this? Or shall I take a stab?
Shoot, in working to encapsulate Resources with PyFilesystem, I did hit https://github.com/PyFilesystem/pyfilesystem2/issues/120 and #688 .... sigh. I need to beef things up to make things work :|
FYI I submitted a PR to https://github.com/PyFilesystem/pyfilesystem2/pull/121 in order to have proper handling of non-fs decoable bytes-only paths. Based on this I am going forward with the refactoring of the Resource and scan pipeline around PyFilesystem2 which should provide a clean abstraction and simplify vastly creating new plugins
Merged in develop with #885 !
We should have a way for a plugin to provide not one, but several CLI options Also a plugin should be able to provide fully configured click Options as options and not be constrained in what the options args are.
This is needed whenever more than one option is required or when more advanced settings need to be provided for a given option. This would be needed for scan plugins where several have multiple options.
@yashdsaraf ping :)