I did that, but the problem is in dpr.utils.data_utils. There is no function read_xict_samples_from_json_files.
You are right. Here is the function:
def read_xict_samples_from_json_files(paths: List[str], upsample_rates: List = None) -> List:
    results = []
    if upsample_rates is None:
        upsample_rates = [1] * len(paths)
    assert len(upsample_rates) == len(paths), 'up-sample rates parameter doesn\'t match input files amount'
    for i, path in enumerate(paths):
        with open(path, 'r', encoding="utf-8") as f:
            logger.info('Reading file %s' % path)
            data = [json.loads(l) for l in f.readlines()]
            upsample_factor = int(upsample_rates[i])
            data = data * upsample_factor
            results.extend(data)
    logger.info('Aggregated data size: {}'.format(len(results)))
    return results
I will update the README later.
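For anyone else wiring this up, here is a minimal usage sketch. It assumes the function above has already been added to the installed dpr/utils/data_utils.py, and the two JSON-lines files (and their sample fields) are made up on the fly purely for illustration:

import json
import os
import tempfile

# Assumes the function shown above is now part of the installed dpr package.
from dpr.utils.data_utils import read_xict_samples_from_json_files

# Write two tiny JSON-lines files with made-up sample records
# (the function only needs one JSON object per line; the keys here are arbitrary).
tmp_dir = tempfile.mkdtemp()
paths = []
for name, num_rows in [('a.jsonl', 2), ('b.jsonl', 1)]:
    p = os.path.join(tmp_dir, name)
    with open(p, 'w', encoding='utf-8') as f:
        for j in range(num_rows):
            f.write(json.dumps({'question': 'q%d' % j, 'positive_ctxs': []}) + '\n')
    paths.append(p)

# Up-sample the second file 3x: expect 2 + 1*3 = 5 samples in total.
samples = read_xict_samples_from_json_files(paths, upsample_rates=[1, 3])
print(len(samples))  # 5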
When I run run_xict.sh, it shows the following error. It seems to me like you have not included dpr in this repository.
Traceback (most recent call last):
  File "run_xict.py", line 32, in <module>
    from dpr.utils.data_utils import ShardedDataIterator, read_xict_samples_from_json_files, Tensorizer
ImportError: cannot import name 'read_xict_samples_from_json_files' from 'dpr.utils.data_utils' (/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/dpr/utils/data_uti$
/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/torch/distributed/launch.py:186: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects `--local_rank` argument to be set, please
change it to read from `os.environ['LOCAL_RANK']` instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions
  FutureWarning,
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 40046) of binary: /scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/bin/python
Traceback (most recent call last):
  File "/home/apps/DL-CondaPy3.7/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/apps/DL-CondaPy3.7/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/torch/distributed/launch.py", line 193, in <module>
    main()
  File "/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/torch/distributed/launch.py", line 189, in main
    launch(args)
  File "/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/torch/distributed/launch.py", line 174, in launch
    run(args)
  File "/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/torch/distributed/run.py", line 713, in run
    )(*cmd_args)
  File "/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 261, in launch_agent
    failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
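In case it helps others hitting the same ImportError: it just means the copy of dpr/utils/data_utils.py that Python actually loads (the site-packages path in the traceback) does not contain the function yet. A minimal check, assuming nothing beyond a working dpr install, is:

import dpr.utils.data_utils as du

# Print the path of the module the interpreter resolves; paste the function
# from the comment above into that file (or reinstall dpr from a patched source tree),
# then this should print True.
print(du.__file__)
print(hasattr(du, 'read_xict_samples_from_json_files'))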