Closed ryantwolf closed 1 month ago
@ryantwolf I think we should have this functionality. I was trying to use the Curator through a notebook and had to do some gymnastics to create a client using `get_client()`.
If this is too much work for now, a simple workaround would be an equivalent of `add_distributed_args()` that returns a dictionary of the required arguments. The user could then pass that dict to `get_client()` and still use the code in a notebook environment.
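To make the workaround concrete, here is a rough sketch of what I have in mind. It assumes the defaults can be recovered by parsing an empty argument list. `add_distributed_args_stub`, `default_distributed_args`, and the three options shown are made up for illustration; they are not the library's actual helper or argument set.

```python
import argparse


def add_distributed_args_stub(parser: argparse.ArgumentParser) -> argparse.ArgumentParser:
    # Hypothetical stand-in for the library helper; the option names are assumptions.
    parser.add_argument("--scheduler-address", type=str, default=None)
    parser.add_argument("--device", type=str, default="cpu")
    parser.add_argument("--n-workers", type=int, default=4)
    return parser


def default_distributed_args(**overrides) -> dict:
    """Return the parser defaults as a plain dict, with optional overrides."""
    parser = add_distributed_args_stub(argparse.ArgumentParser())
    # Parsing an empty list yields a namespace holding only the defaults.
    args = vars(parser.parse_args([]))
    unknown = set(overrides) - set(args)
    if unknown:
        raise TypeError(f"unknown arguments: {sorted(unknown)}")
    args.update(overrides)
    return args


# In a notebook there is no command line, so build the dict directly.
client_args = default_distributed_args(device="gpu")
```

The resulting dict could then be handed to whatever `get_client()` accepts, without the notebook ever touching `sys.argv`.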
Main items:
- `get_client` function to not use argparse
- `parse_client_args` function to bridge argparse to new `get_client`
- `get_client` invocations
- `get_client` to be imported from root module like `from nemo_curator import get_client`
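As a rough illustration of how the items above fit together, the sketch below separates a keyword-argument `get_client` from a `parse_client_args` bridge. The parameter names (`cluster_type`, `scheduler_address`, `n_workers`) and the namespace attributes are assumptions for the example, not the actual signatures in this PR, and the stand-in returns a settings dict instead of creating a real Dask client.

```python
import argparse


def get_client(cluster_type="cpu", scheduler_address=None, n_workers=4):
    """Illustrative stand-in: takes plain keyword arguments, no argparse.

    A real implementation would create or connect to a cluster; here we
    only validate and echo the settings so the sketch stays self-contained.
    """
    if cluster_type not in ("cpu", "gpu"):
        raise ValueError(f"unsupported cluster_type: {cluster_type!r}")
    return {
        "cluster_type": cluster_type,
        "scheduler_address": scheduler_address,
        "n_workers": n_workers,
    }


def parse_client_args(args: argparse.Namespace) -> dict:
    """Bridge: translate parsed CLI arguments into get_client keyword arguments."""
    return {
        "cluster_type": args.device,
        "scheduler_address": args.scheduler_address,
        "n_workers": args.n_workers,
    }


# Script usage: arguments come from argparse and pass through the bridge.
cli_args = argparse.Namespace(device="gpu", scheduler_address=None, n_workers=2)
script_client = get_client(**parse_client_args(cli_args))

# Notebook usage: call get_client directly with keyword arguments.
notebook_client = get_client(cluster_type="cpu", n_workers=8)
```

Keeping the argparse handling in the bridge means scripts keep their existing command-line flags while notebooks call `get_client` with ordinary keyword arguments.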
Other changes:
- `find_exact_duplicates.py` determines whether to use CPU or GPU.