Open piconti opened 2 weeks ago
Other tasks and things that are to do/were done as part of this migration:
- `KNOWN_JOURNALS` with the new titles, and make it a dict to include the origin partner of the title.
- `data`, `config` and `logs` directories
- `schemas` as a git submodule
- `push_to_git` set to true, and set it to `None` by default when instantiating the manifest
- `versioning.helpers` to a new `versioning.aggregators` module.
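
As a rough illustration of the `KNOWN_JOURNALS` item, one possible shape for the dict is a mapping from title to origin partner. This is only a sketch: the direction of the mapping, the placeholder titles and partners, and the helper names `titles_for_partner` and `is_known` are all assumptions, not taken from the actual code.

```python
# Illustrative only: titles and partners below are placeholders, not real values.
# Assumed shape: title -> origin partner (the reverse mapping would also be plausible).
KNOWN_JOURNALS = {
    "title_a": "partner_x",
    "title_b": "partner_x",
    "title_c": "partner_y",
}


def titles_for_partner(partner):
    """Return the sorted titles whose origin partner matches `partner` (hypothetical helper)."""
    return sorted(t for t, p in KNOWN_JOURNALS.items() if p == partner)


def is_known(title):
    """Membership checks behave as they would on a plain collection of titles."""
    return title in KNOWN_JOURNALS
```

With this shape, existing checks of the form `title in KNOWN_JOURNALS` would keep working unchanged, while the partner becomes available through a plain lookup.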
As mentioned in this issue, it was decided to separate the modules of impresso-commons into two:
This issue thus covers the second part of this restructuring, as described here.
- `utils.s3.py` -> moved to a new `io` module, potentially merged with `s3_delete.py`.
- `utils.s3_delete.py` -> moved to a new `io` module
- `utils.umia.py` -> kept in current impresso_commons as legacy, moved to another repo if needed.
- `utils.utils.py` -> kept as-is but as module and not as submodule
- `utils.dask_utils` -> moved to new `data_processing` submodule, keeping only relevant functions
- `utils.config_loader.py` -> removed
- `path.path_s3.py` -> moved to new `io` module
- `path.path_fs.py` -> moved to impresso-text-acquisition, keeping only relevant functions
- `images.img_utils.py` -> kept in current impresso_commons as legacy, moved to another repo if needed.
- `images.olive_boxes.py` -> kept in current impresso_commons as legacy, moved to another repo if needed.
- `classes.contentitem.py` -> kept in current impresso_commons as legacy, moved to another repo if needed.
- `text.rebuilder.py` -> moved to impresso-text-acquisition
- `text.helpers.py` -> moved to impresso-text-acquisition
- `versioning.data_manifest.py` -> kept as-is in a `versioning` module
- `versioning.data_statistics.py` -> kept as-is in a `versioning` module
- `versioning.helpers.py` -> kept as-is in a `versioning` module
- `versioning.compute_manifest.py` -> kept as-is in a `versioning` module
- `schemas` -> kept as-is

Potentially, other modules and submodules might be added, in a
`text` submodule. These are highly reusable modules for text processing:
- `text_utils.py`
- `tokenization_utils.py`
- `ner_utils.py`

to be added by @EmanuelaBoros.

In addition to these modifications, all the code needs to be