dnanexus / UKB_RAP

Access share reviewed code & Jupyter Notebooks for use on the UK Biobank (UKBB) Research Application Platform. Includes resources from DNAnexus webinars, online trainings and workshops.
MIT License

Running a script over a bunch of vcf files #35

Open MaryGoAround opened 5 months ago

MaryGoAround commented 5 months ago

Hi

I am interested in the individual-level WGS VCF files in "/mnt/project/Bulk/DRAGEN WGS/Whole genome variant call files (VCFs) (DRAGEN) [500k release]" for a list of participants in eid.txt.

I want to locate these VCF files, so I ran the command below, but I get an error:

usr@LV19Y7325V dnanexus-upload-agent-1.5.33-osx % dx ls "/mnt/project/Bulk/DRAGEN WGS/Whole genome variant call files (VCFs) (DRAGEN) [500k release]" VCF/*gz > tempfile.txt ;
zsh: no matches found: VCF/*gz

I also tried this command and got an error too:

usr@LV19Y7325V dnanexus-upload-agent-1.5.33-osx % for i in `cat eid.txt`; do dx find data --property eid=$i --folder "/mnt/project/Bulk/DRAGEN WGS/Whole genome variant call files (VCFs) (DRAGEN) [500k release]" ; done
cat: eid.txt: No such file or directory

Or, I tried this, which also failed:

usr@LV19Y7325V ~ % dx ls './*.gz*'                         
dxpy.utils.resolver.ResolutionError: Unable to resolve "*.gz*" to a data object or folder name in '/Bulk/DRAGEN WGS/Whole genome variant call files (VCFs) (DRAGEN) [500k release]'

Please could somebody help me with this?
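
For reference, here is a minimal sketch of the matching step, not a confirmed answer. The first error comes from zsh itself: it tries to expand the unquoted VCF/*gz against the local directory before dx ever runs. The second only says that eid.txt is not in the directory the loop was started from. Once a listing of the remote file names exists in tempfile.txt (one name or path per line), it can be filtered locally against eid.txt. The helper name filter_by_eid is hypothetical, and the sketch assumes that each file name starts with the participant ID followed by an underscore, which should be checked against the real names first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the lines of a file listing whose file name
# starts with one of the participant IDs in the eid file.
# ASSUMPTION: file names look like "<eid>_<anything>", e.g. "1234567_x.vcf.gz".
filter_by_eid() {
    local eid_file="$1" listing_file="$2"
    awk '
        { sub(/\r$/, "") }                    # tolerate Windows line endings
        NR == FNR {                           # first file: load the wanted eids
            if ($1 != "") wanted[$1] = 1
            next
        }
        {
            n = split($0, parts, "/")         # drop any folder prefix
            split(parts[n], fields, "_")      # eid is the text before the first "_"
            if (fields[1] in wanted) print
        }
    ' "$eid_file" "$listing_file"
}
```

Usage would be along the lines of `filter_by_eid eid.txt tempfile.txt > matched.txt`, run from the directory that actually contains both files.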