A set of helper tools for assembling the different elements of the RELECOV platform (Spanish Network for genomic surveillance of SARS-CoV-2), such as data download, processing, validation and upload to public databases, as well as analysis runs and database storage.
GNU General Public License v3.0
Read-lab-metadata search for files recursively when no samples_data is given #318
Recent changes made the `samples_data.json` file optional for the `read-lab-metadata` module (this file is generated by the "download" module). As a result, when the file is not given, every sample passes the file integrity filter and the "fastq_filepath" fields are saved as the folder where the provided metadata file is located.
It would be nice if the module used `os.walk()` to recursively search for files named ["sequence_file_RX_fastq"] and used this information to create a `samples_data.json` that includes the paths and md5s of each sample's files. Samples with any issue (corrupted files, non-existing files, or an md5 mismatch if an md5 is given) would be removed from the final JSON.
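A minimal sketch of what this could look like, not the module's actual code. It assumes the metadata is already loaded as a dict keyed by sample ID, that the `sequence_file_RX_fastq` fields hold file names (here `sequence_file_R1_fastq` and `sequence_file_R2_fastq`), and that an expected md5, when given, lives in a hypothetical `<field>_md5` key. The function names and the output layout are illustrative only. A real corruption check (for example gzip integrity) is left out.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_files(root_dir):
    """Walk root_dir recursively and map each file name to its folder."""
    index = {}
    for folder, _subdirs, files in os.walk(root_dir):
        for name in files:
            # First match wins; duplicated names are not handled in this sketch.
            index.setdefault(name, folder)
    return index


def build_samples_data(
    metadata,
    root_dir,
    file_fields=("sequence_file_R1_fastq", "sequence_file_R2_fastq"),
):
    """Return (valid, rejected): samples with usable files, and reasons for the rest.

    The "<field>_md5" key and the output layout are assumptions for illustration.
    """
    index = index_files(root_dir)
    valid, rejected = {}, {}
    for sample_id, fields in metadata.items():
        entry, problem = {}, None
        for field in file_fields:
            file_name = fields.get(field)
            if not file_name:
                continue  # e.g. a single-end sample with no R2 file
            folder = index.get(file_name)
            if folder is None:
                problem = f"{file_name}: file not found"
                break
            path = os.path.join(folder, file_name)
            md5 = md5_of_file(path)
            expected = fields.get(f"{field}_md5")
            if expected and expected.lower() != md5:
                problem = f"{file_name}: md5 mismatch"
                break
            entry[field] = {"path": path, "md5": md5}
        if problem is None and not entry:
            problem = "no sequence files listed"
        if problem is None:
            valid[sample_id] = entry
        else:
            rejected[sample_id] = problem
    return valid, rejected
```

The `valid` dict could then be written out with `json.dump` as the generated `samples_data.json`, while `rejected` could be logged so users can see why a sample was dropped.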