Problem
When using the Gen3-DRS option, no information is provided if a file in the input.csv is not present in the manifest.json file.
Solution
Add a log file listing the file names in the input.csv that are not present in the manifest.json file.
Implementation
This should be implemented in the filter_manifest.py helper script by comparing the file names in the resulting filtered manifest with those in the original input.csv file:
    # Rows of input.csv whose file_name is missing from the filtered manifest
    not_found_df = reads_df[~reads_df['file_name'].isin(manifest_df['file_name'])]
    if len(not_found_df) > 0:
        print("The following file_name IDs were not found in manifest:")
        print(not_found_df)
        not_found_df.to_csv("not_found_GTEX_samples.txt", index=False)
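For reference, the same check could be wrapped in a small function and tried on toy data. This is only a minimal sketch: the function name `report_missing_files`, the default log path, and the toy DataFrames (including the `object_id` column) are assumptions made for illustration and are not taken from filter_manifest.py.

```python
import pandas as pd


def report_missing_files(reads_df, manifest_df, log_path="not_found_samples.txt"):
    """Return the input rows whose file_name is absent from the manifest.

    Hypothetical helper: the rows are printed and written to log_path
    only when at least one file name is missing.
    """
    missing_mask = ~reads_df["file_name"].isin(manifest_df["file_name"])
    missing_df = reads_df[missing_mask]
    if not missing_df.empty:
        print(f"{len(missing_df)} file_name value(s) not found in manifest:")
        print(missing_df["file_name"].to_string(index=False))
        missing_df.to_csv(log_path, index=False)
    return missing_df


# Toy data standing in for input.csv and the filtered manifest.json
# (column layout is an assumption for this example).
reads_df = pd.DataFrame({"file_name": ["a.bam", "b.bam", "c.bam"]})
manifest_df = pd.DataFrame(
    {"file_name": ["a.bam", "c.bam"], "object_id": ["id-1", "id-3"]}
)

missing_df = report_missing_files(reads_df, manifest_df)
```

Returning the missing rows, rather than only printing them, would let the caller decide whether to continue with the files that were found or to stop the run.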
This has been tried in tag Simplify-Gen3-DRS-7, corresponding to the failing run https://cloudos.lifebit.ai/public/jobs/6203e3cb91203701dcbcb686