Illumina / Isaac3

Aligner for sequencing data

[Question]: Isaac index creation (grch38/m38) #4

Closed sklages closed 7 years ago

sklages commented 8 years ago

Hi,

creating indices for grch38 and grcm38 leaves some open questions:

I have run index creation as follows (mask-width 0 is the default, I just put it there as a "reminder" for future index creation runs):

isaac-sort-reference \
  --output-directory iSAACindex \
  --jobs 1 \
  --mask-width 0 \
  --genome-file genome.fa

That left me with exactly three files and a 1.1 TiB Temp folder:

-rw-rw-r-- 1 klages klages 618M 2016.08.26 01:05:08 2repeatness.8bpb.gz
-rw-rw-r-- 1 klages klages 678M 2016.08.25 22:19:13 2uniqueness.8bpb.gz
-rw-rw-r-- 1 klages klages 108K 2016.08.26 01:05:09 sorted-reference.xml
drwxrwxr-x 2 klages klages 8.0K 2016.08.26 01:05:09 Temp

make reported

[all]    INFO: All done!

At least it is "packable" by isaac-pack-reference.

For comparison, hg19-packed-reference.tar.gz from BaseSpace contains:

-rwxr-x--- rpetrovski/aladdin 644685308 2014-11-19 21:38 2uniqueness.16bpb.gz
-rw-r--r-- rpetrovski/aladdin 386961748 2014-11-20 13:03 neighbors-1or2-16.1bpb
-rw-r--r-- rpetrovski/aladdin 386961748 2014-11-20 13:06 neighbors-1or2-32.1bpb
-rwxr-xr-- rpetrovski/aladdin 3157608038 2014-11-20 12:53 genome.fa
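-rw-r--r-- rpetrovski/aladdin      48044 2014-11-20 12:54 sorted-reference.xml

  • Is that a complete and valid index?
  • Do I still need Temp for any task after index creation?
  • What are the differences compared to isaac2 indices?
  • It would be nice to have some grch38/grcm38 indexes on BaseSpace.

isaac-sort-reference --version
iSAAC-03.16.06.06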

best, Sven
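
For anyone repeating the comparison in the post above, the contents of a packed reference can be listed without extracting it. This is only a sketch: `list_packed_reference` is a hypothetical helper name, not an Isaac tool, and it assumes the packed reference is an ordinary gzip-compressed tar archive, as the `.tar.gz` name suggests.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Isaac): print the sorted member names
# of a gzip-compressed tar archive without unpacking it.
list_packed_reference() {
    local archive="$1"
    # -t lists members, -z reads through gzip, -f names the archive file.
    tar -tzf "$archive" | sort
}
```

Running `list_packed_reference hg19-packed-reference.tar.gz` and comparing the result with `ls iSAACindex | sort` shows which files exist on only one side.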

rpetrovski commented 8 years ago

Neighbour files are needed to properly unpack an isaac2 reference. If you simply untar it, you will not be able to run isaac2 with it. 2repeatness is not part of an isaac2 reference. The Temp folder can be safely deleted once isaac-sort-reference has finished.

R.
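
Since Temp can be deleted once sorting is done, a cautious cleanup might first confirm that the final outputs are in place. The sketch below is an assumption-laden illustration, not an Isaac tool: `cleanup_isaac_temp` is a hypothetical helper, and the three file names are simply those from the directory listing in the original post (a run with different settings may produce different names).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Isaac): delete the Temp folder of an
# isaac-sort-reference output directory, but only if the expected final
# files exist, are non-empty, and the gzip files pass an integrity test.
cleanup_isaac_temp() {
    local index_dir="$1"
    local f
    # Assumed file names, copied from the listing in the original post.
    for f in sorted-reference.xml 2uniqueness.8bpb.gz 2repeatness.8bpb.gz; do
        if [ ! -s "$index_dir/$f" ]; then
            echo "missing or empty: $index_dir/$f (keeping Temp)" >&2
            return 1
        fi
    done
    # gzip -t fails if a compressed file is truncated or corrupt.
    gzip -t "$index_dir"/*.gz || return 1
    rm -rf -- "$index_dir/Temp"
}
```

For the run described above this would be called as `cleanup_isaac_temp iSAACindex`. The checks only show that the files exist and are readable gzip streams; they do not validate the index itself.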

On 1 Sep 2016 13:44, "sklages" notifications@github.com wrote:

  • Is that a complete and valid index?
  • Do I still need Temp for any task after index creation?
  • What are the differences compared to isaac2 indices?
  • It would be nice to have some grch38/grcm38 indexes on BaseSpace.
