Closed kfuku52 closed 2 years ago
On it!
amalgkit config is done: https://github.com/kfuku52/amalgkit/commit/dd20107aa74597a11d324a767c804cb12184ec47
Had to hard-code the file creation for all .config files, since I wanted to keep the headers explaining how to use the config files.
Currently there are three parameters:
--out_dir, default: ./
Directory that config will use as the working directory (a folder named config will be created in --out_dir).
--config_dir, default: empty_config
Name of the folder in which the config files will be created.
--overwrite, default: no
If --config_dir already exists, amalgkit config will stop immediately if --overwrite is no. If yes, the config files will be overwritten instead.
Resulting directory structure:
out_dir <-- can be changed
│
└───config <-- cannot be changed
    │
    └───config_dir <-- can be changed
        │ control_term.config
        │ exclude_id.config
        │ ...
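The layout and the --overwrite behaviour described above can be sketched roughly as follows. This is only an illustration, not the code from the commit: the function name prepare_config_dir is hypothetical, and raising FileExistsError is an assumption standing in for "stop immediately".

```python
from pathlib import Path


def prepare_config_dir(out_dir="./", config_dir="empty_config", overwrite=False):
    """Create <out_dir>/config/<config_dir> and return its path.

    Hypothetical helper: the fixed 'config' level and the defaults mirror
    the description in this thread; the error type is an assumption.
    """
    target = Path(out_dir) / "config" / config_dir
    if target.exists() and not overwrite:
        # Corresponds to --overwrite no: refuse to touch an existing folder.
        raise FileExistsError(f"{target} already exists; rerun with overwrite enabled")
    # Corresponds to --overwrite yes, or to a fresh directory.
    target.mkdir(parents=True, exist_ok=True)
    return target
```

Calling prepare_config_dir("./") twice in a row would fail on the second call unless overwrite=True is passed.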
Thanks. Can we stick to the way used in csubst dataset (i.e., just copy files to a specified directory)? The config files should be near-empty, but not completely empty because there are commonly applied conditions, such as "Filter" "type rnaseq" in search_term_other.config, and it is difficult to modify if hardcoded in the python script. I see no problem with headers, so could you explain it more?
Ah, now the csubst code makes more sense to me. There is a nearly empty set of files somewhere in the installation folder (i.e. previously manually created), and the command just copies them somewhere else?
Whereas I created the files from scratch.
Yes, the files for csubst dataset are placed here: https://github.com/kfuku52/csubst/tree/master/csubst/dataset
alright, took me a while to figure out how this works, but here's the new version: https://github.com/kfuku52/amalgkit/commit/0fe5bb77ae764ced3724428bb94a900a307b2261
At first I wanted to use pkg_resources as in csubst, but upon further investigation it is apparently better to use importlib.resources to retrieve files from within the package, since pkg_resources is deprecated now.
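For anyone curious what retrieving packaged files with importlib.resources can look like, here is a minimal sketch (Python 3.9+). It is not the amalgkit implementation: copy_packaged_configs, make_demo_package and the demo_pkg package are made up for illustration, and the throwaway package only exists so the snippet runs without amalgkit installed.

```python
import importlib
import importlib.resources as ir
import sys
import tempfile
from pathlib import Path


def copy_packaged_configs(package, dest):
    """Copy every *.config file shipped inside `package` into `dest`."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for entry in ir.files(package).iterdir():
        if entry.is_file() and entry.name.endswith(".config"):
            (dest / entry.name).write_bytes(entry.read_bytes())
            copied.append(entry.name)
    return sorted(copied)


def make_demo_package(root, name="demo_pkg"):
    """Build a throwaway package <name>.base holding two config files."""
    base = Path(root) / name / "base"
    base.mkdir(parents=True)
    (Path(root) / name / "__init__.py").write_text("")
    (base / "__init__.py").write_text("")
    (base / "control_term.config").write_text("# header\n")
    (base / "exclude_id.config").write_text("# header\n")
    sys.path.insert(0, str(root))  # make the demo package importable
    importlib.invalidate_caches()


with tempfile.TemporaryDirectory() as tmp:
    make_demo_package(tmp)
    out = Path(tmp) / "out" / "config" / "my_config"
    print(copy_packaged_configs("demo_pkg.base", out))
```

The sketch sticks to iterdir(), is_file() and read_bytes() because those are part of the Traversable interface returned by files(), so it should also work when the package is not a plain directory on disk.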
Anyways, this is how amalgkit config works now:
--overwrite
Terminates the process if the target directory already exists and --overwrite is set to no.
--config_dir
Name of the config directory to be created, i.e. the destination (defaults to the value of --config).
--config
Name of the config directory stored in the amalgkit package, i.e. the source. Currently there are four options: 'test', 'plantae', 'vertebrate' and 'base'. 'base' is the nearly empty dataset, where I left just the header and 1-3 lines as an example within each of the config files.
--out_dir
Working directory.
Example:
amalgkit config --config base --config_dir my_config --out_dir ./
will create ./config/my_config/ and copy all .config files from 'base' in there.
amalgkit config --config vertebrate --out_dir ./
will create ./config/vertebrate/ and copy all .config files from 'vertebrate' in there.
amalgkit config
will create ./config/base and copy all .config files from 'base' in there.
Note: I had to make a copy of the config folder inside amalgkit and give it a different name for this to work. The config folder is now redundant, so we can delete it.
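The way the destination is derived in the three examples above can be sketched with argparse like this. The flag names and the four --config choices come from this thread; the function names, the 'no' default for --overwrite, and the exact parser layout are assumptions, not the actual amalgkit code.

```python
import argparse
from pathlib import Path


def build_parser():
    """Hypothetical parser mirroring the flags described in this thread."""
    parser = argparse.ArgumentParser(prog="amalgkit config")
    parser.add_argument("--config", default="base",
                        choices=["test", "plantae", "vertebrate", "base"])
    parser.add_argument("--config_dir", default=None)
    parser.add_argument("--out_dir", default="./")
    parser.add_argument("--overwrite", default="no", choices=["yes", "no"])
    return parser


def destination(args):
    """Return <out_dir>/config/<config_dir>, falling back to --config."""
    name = args.config_dir if args.config_dir else args.config
    return Path(args.out_dir) / "config" / name


for argv in (["--config", "base", "--config_dir", "my_config", "--out_dir", "./"],
             ["--config", "vertebrate", "--out_dir", "./"],
             []):
    print(destination(build_parser().parse_args(argv)))
```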
Did your test go well with a fresh install with pip? I got an error:
(base) wbo1129:~ kef74yk$ amalgkit --version
amalgkit version 0.6.7.2
(base) wbo1129:~ kef74yk$ amalgkit config --config base --config_dir my_config --out_dir ./
amalgkit config: start
Checking config directory ...
Traceback (most recent call last):
File "/Users/kef74yk/opt/miniconda3/bin/amalgkit", line 404, in <module>
args.handler(args)
File "/Users/kef74yk/opt/miniconda3/bin/amalgkit", line 105, in command_config
config_main(args)
File "/Users/kef74yk/opt/miniconda3/lib/python3.9/site-packages/amalgkit/config.py", line 85, in config_main
create_config_from_package(args)
File "/Users/kef74yk/opt/miniconda3/lib/python3.9/site-packages/amalgkit/config.py", line 70, in create_config_from_package
config_files = ir.files(config_base).rglob('*.config')
File "/Users/kef74yk/opt/miniconda3/lib/python3.9/importlib/resources.py", line 147, in files
return _common.from_package(_get_package(package))
File "/Users/kef74yk/opt/miniconda3/lib/python3.9/importlib/resources.py", line 49, in _get_package
module = _resolve(package)
File "/Users/kef74yk/opt/miniconda3/lib/python3.9/importlib/resources.py", line 40, in _resolve
return import_module(name)
File "/Users/kef74yk/opt/miniconda3/lib/python3.9/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
File "<frozen importlib._bootstrap>", line 972, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'config_dir'
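A note for anyone hitting a similar traceback: importlib.resources.files() imports the package it is given, so a name that cannot be imported from the installed environment ends in ModuleNotFoundError, as above. The sketch below only demonstrates that behaviour with a made-up package name. The wrapper packaged_config_root is hypothetical, and the hint about a fully qualified name is a guess, not a description of the actual fix.

```python
import importlib.resources as ir


def packaged_config_root(package):
    """Return the resource root of `package`, with a clearer error on failure."""
    try:
        return ir.files(package)
    except ModuleNotFoundError as err:
        # err.name holds the first module in the dotted path that was not found.
        raise ModuleNotFoundError(
            f"Cannot locate packaged config files: '{package}' is not importable "
            f"(missing module: {err.name}). A fully qualified package name "
            f"may be required.",
            name=err.name,
        ) from err


try:
    packaged_config_root("no_such_pkg_for_demo.base")
except ModuleNotFoundError as err:
    print(err)
```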
I pushed another update, which should fix this issue. Can you try it out on your end? https://github.com/kfuku52/amalgkit/commit/eb22ce5cc382472079b45cbfa8669063c869a2b6
The latest version worked well. Thank you!
Great! I will close this for now, then.
amalgkit metadata should be started more easily, and there should be a single command that generates all necessary config files (near empty to be neutral enough for any purposes/organisms). A new subcommand should look like csubst dataset and may be named amalgkit config. @Hego-CCTB Could you implement it?