faircloth-lab / phyluce

software for UCE (and general) phylogenomics
http://phyluce.readthedocs.org/

phyluce_probe_run_multiple_lastzs_sqlite issue #340

Open jovana03 opened 1 month ago

jovana03 commented 1 month ago

Hello,

I am designing a new UCE probe set. Two of my genome files are quite large (8.3 and 8.6 G), so when I try to align the temporary probe set against them, I get the following error:

IOError: Lastz returned: FAILURE: in new_position_table(), prev[] array size (6,140,087,216) exceeds allocation limit of 4,294,967,279; consider using lastz_32, or setting max_malloc_index for a special build, or breaking your target sequence into smaller pieces

So it seems that a good potential solution is to split those genome files. I looked for instructions on some forums, but the links that people shared, which apparently had more detailed information, are no longer available.

Has anyone had the same issue? How did you deal with it?

Thanks. Regards, Jovana.

brantfaircloth commented 1 month ago

You can split your files using a program like faSplit from the UCSC tools: http://hgdownload.soe.ucsc.edu/admin/exe/. Those genome files are very, very large - I could probably think of other ways to process them, but those may not be ideal.
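
If faSplit is not at hand, the same idea can be sketched in plain Python. This is only a minimal illustration, not something phyluce or lastz provides: the names `read_fasta`, `split_fasta`, `max_bases`, and the `chunk_NNN.fasta` file naming are all hypothetical. It assumes the genome is a multi-record FASTA file and that keeping each piece well under the 4,294,967,279 limit from the error message is enough; the thread does not say what piece size is safe. It never cuts inside a record, so a single scaffold longer than `max_bases` still ends up in one oversized piece.

```python
import os
import tempfile


def read_fasta(path):
    """Yield (header, sequence_lines) for each record in a FASTA file."""
    header, seq_lines = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\r\n")
            if line.startswith(">"):
                if header is not None:
                    yield header, seq_lines
                header, seq_lines = line, []
            elif line and header is not None:
                seq_lines.append(line)
    if header is not None:
        yield header, seq_lines


def split_fasta(path, out_dir, max_bases):
    """Write whole records into numbered chunk files (hypothetical helper).

    A new chunk starts whenever adding the next record would push the
    current chunk past max_bases. Records are never cut, and only one
    record is held in memory at a time.
    """
    os.makedirs(out_dir, exist_ok=True)
    chunk_paths = []
    out = None
    bases_in_chunk = 0
    try:
        for header, seq_lines in read_fasta(path):
            record_len = sum(len(s) for s in seq_lines)
            if out is None or bases_in_chunk + record_len > max_bases:
                if out is not None:
                    out.close()
                name = f"chunk_{len(chunk_paths):03d}.fasta"
                chunk_paths.append(os.path.join(out_dir, name))
                out = open(chunk_paths[-1], "w")
                bases_in_chunk = 0
            out.write(header + "\n")
            out.writelines(s + "\n" for s in seq_lines)
            bases_in_chunk += record_len
    finally:
        if out is not None:
            out.close()
    return chunk_paths


if __name__ == "__main__":
    # Toy demonstration on a tiny made-up genome; a real run would use a
    # max_bases value in the billions rather than 10.
    with tempfile.TemporaryDirectory() as tmp:
        genome = os.path.join(tmp, "genome.fasta")
        with open(genome, "w") as fh:
            fh.write(">scaf1\nACGTACGT\n>scaf2\nACGT\n>scaf3\nACGTAC\n")
        for chunk in split_fasta(genome, os.path.join(tmp, "chunks"), max_bases=10):
            print(os.path.basename(chunk))
```

Each resulting chunk would then be treated as a separate, smaller target for the alignment step. How best to feed the pieces to phyluce_probe_run_multiple_lastzs_sqlite and combine the results is not covered in this thread.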