Closed jolespin closed 3 years ago
It is possible to add this feature. I'm about to make some general changes to the code and can also include better handling of empty input files.
Hi, I am wondering if there is a time estimate for this feature request (handling empty input files), or if anyone has recommendations for temporary workarounds to use within pipelines.
Thank you for all your awesome work on DAS Tool!!
I have a for-loop somewhere that basically says:
s2b=""
for FP in "list" "of" "scaffold2bins"; do
    echo $FP
    if file is not empty: s2b += FP + ","
    else: don't do that
done
This is extremely pseudo-code, but that's what I ended up doing. I'm not sure where the actual bash code is, so I can't be much more help. Stack Overflow should help with the details, though.
Hi @zoey-rw,
As @jolespin proposed, a quick workaround is a script that checks the size of your input files. The following Bash example uses two scaffold2bin files from the sample_data directory of this repo and adds one empty file (sample.human.gut_emptyBin_scaffolds2bin.tsv):
# create empty file:
touch sample_data/sample.human.gut_emptyBin_scaffolds2bin.tsv
# define input files including empty file:
scaffoldstobins='sample_data/sample.human.gut_concoct_scaffolds2bin.tsv,sample_data/sample.human.gut_emptyBin_scaffolds2bin.tsv,sample_data/sample.human.gut_metabat_scaffolds2bin.tsv'
# check if any scaffold2bin files are empty:
s2b_tmp=''
for i in $(echo "${scaffoldstobins}" | tr "," " ")
do
    # count the lines in the file; 0 means the file is empty
    scaf2bin_wcl=$(wc -l < "${i}")
    if [ "${scaf2bin_wcl}" -eq "0" ]
    then
        echo "Warning: scaffolds2bin file is empty: $i"
    else
        s2b_tmp=${s2b_tmp},${i}
    fi
done
# remove initial ',':
s2b_tmp=${s2b_tmp#","}
# check if all scaffold2bin files are empty:
if [ "${#s2b_tmp}" -eq "0" ]
then
    echo "Warning: All input files are empty."
else
    scaffoldstobins=${s2b_tmp}
fi
echo "$scaffoldstobins"
Now, you would run DAS Tool with -i $scaffoldstobins as input.
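For use inside a pipeline, the same check can be wrapped in a function whose exit status tells the caller whether any usable file is left, so the DAS Tool step can be skipped when every binner came back empty. This is only a sketch: filter_nonempty is a hypothetical helper name, not part of DAS Tool, and it uses the shell's -s test (file exists and has a size greater than zero) instead of counting lines, so a file containing only whitespace would still be passed through.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of DAS Tool): print the comma-separated
# subset of the given files that exist and are non-empty.
# Returns 1 if no usable file remains.
filter_nonempty() {
    local IFS=','      # split the argument on commas, join the result with commas
    local kept=()
    local f
    for f in $1; do
        if [ -s "$f" ]; then
            kept+=("$f")
        else
            echo "Warning: scaffolds2bin file is empty or missing: $f" >&2
        fi
    done
    if [ "${#kept[@]}" -eq 0 ]; then
        echo "Warning: All input files are empty." >&2
        return 1
    fi
    echo "${kept[*]}"
}

# Example usage (assumes $scaffoldstobins is defined as above):
# if scaffoldstobins=$(filter_nonempty "$scaffoldstobins"); then
#     ... run DAS Tool with -i "$scaffoldstobins" here ...
# else
#     echo "Skipping DAS Tool: no bins to refine."
# fi
```

Warnings go to stderr so that only the filtered list is captured by the command substitution.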
Alternatively, you can check out the new empty_input_file_fix branch, which is able to handle empty scaffold2bin files.
This feature has been merged into the master branch. Closing this ticket.
Is this in v1.1.3?
Yes.
What are your thoughts on having DASTool accept empty scaffolds-to-bins files? The reasoning is that DASTool is typically included in pipelines, and sometimes one binner doesn't find any bins while others do. The pipeline breaks because of this. Would DASTool be able to accept such a file and issue a warning saying that the file is empty?