thierrygosselin / stackr

stackr: an R package to run stacks software pipeline
http://thierrygosselin.github.io/stackr/

Some problem around colony input files #18

Closed jblamyatifremer closed 7 years ago

jblamyatifremer commented 7 years ago

Dear Thierry,

I am using your package to feed a large amount of data to COLONY. Your package is the only way to pass data from Stacks to COLONY.

1- First, I use an awk script to filter out loci without polymorphism:

awk '$2 > 0 {print}' "$src_root"/14_reassignation/input/batch_2.haplotypes.tsv > "$src_root"/14_reassignation/input/TRIM.haplotypes.tsv

2- I use your R package and call the haplo2colony function:

res <- haplo2colony("/media/XXX/TRIM.haplotypes.tsv", blacklist.id = NULL, whitelist.loci = NULL, sample.markers = 5, 1, 2, pop.select = "all", allele.freq = FALSE, inbreeding = 0, mating.sys.males = 0, mating.sys.females = 0, clone = 0, run.length = 2, analysis = 1, allelic.dropout = 0, error.rate = 0.02, print.all.colony.opt = FALSE, imputations = FALSE, imputations.group = "populations", num.tree = 100, iteration.rf = 10, split.number = 100, verbose = TRUE, parallel.core = 2, filename = "/home/XXX/colony/colony2_v1.dat")

3- I rename colony2_v1.dat to colony2.dat in the colony directory.

4- I get an error when running colony2s.ifort.out:

jean-baptiste@ordi[colony] mv ./colony2_v1.dat ./colony2.dat [ 6:43]
jean-baptiste@ordi[colony] ./colony2s.ifort.out [ 6:43]

COLONY, Version 2.0.6.2, Build 20160825, Expire Date 20180825
Copyright (C) by Jinliang Wang, Institute of Zoology, Zoological Society of London
Email: jinliang.wang@ioz.ac.uk

Opening & reading data input file: colony2.dat
Marker 2 has the same ID, 169, as marker 1
Errors in DATA. Insufficient data or incorrect format.
Please check DATA and format and then re-run the program
Program stopped in subroutine StopOnDataError

5- After looking into the COLONY user manual, I found that in the attached file colony2.txt (I modified the extension), line 23, the locus names (header) are duplicated. After deleting all duplicates by hand, I got a new (and more severe) error:

jean-baptiste@ordi[jean-baptiste] cd ~/colony [ 6:36]
jean-baptiste@ordi[colony] ./colony2s.ifort.out [ 6:36]

COLONY, Version 2.0.6.2, Build 20160825, Expire Date 20180825
Copyright (C) by Jinliang Wang, Institute of Zoology, Zoological Society of London
Email: jinliang.wang@ioz.ac.uk

Opening & reading data input file: colony2.dat
Reading offspring genotype data...
forrtl: Is a directory
forrtl: severe (30): open failure, unit 10, file /home/jean-baptiste/colony/
Image              PC                Routine  Line     Source
colony2s.ifort.ou  0000000000633E04  Unknown  Unknown  Unknown
colony2s.ifort.ou  00000000006493AB  Unknown  Unknown  Unknown
colony2s.ifort.ou  000000000042AE18  Unknown  Unknown  Unknown
colony2s.ifort.ou  0000000000423E26  Unknown  Unknown  Unknown
colony2s.ifort.ou  0000000000401EF6  Unknown  Unknown  Unknown
colony2s.ifort.ou  0000000000401E7E  Unknown  Unknown  Unknown
colony2s.ifort.ou  00000000006E47A4  Unknown  Unknown  Unknown
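The duplicated marker IDs that COLONY complains about in step 4 could be located before running the program. A minimal sketch, assuming the marker names sit on a single whitespace-separated line of the .dat file (line 23 in the attached file, per step 5). `dup_markers` is a hypothetical helper, not part of COLONY or stackr:

```shell
# dup_markers FILE LINE_NUMBER
# Prints every marker name that occurs more than once on the given line,
# once each, in order of its second appearance.
# Assumption: names on that line are separated by spaces or tabs.
dup_markers() {
  awk -v n="$2" '
    NR == n {
      for (i = 1; i <= NF; i++)
        if (seen[$i]++ == 1) print $i   # report on the second occurrence only
    }
  ' "$1"
}

# Example call (file name and line number taken from the report above):
# dup_markers colony2.dat 23
```

An empty result would mean no repeated names on that line; it does not validate the rest of the file.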

Since COLONY2 input files are quite painful to build, I would be very grateful for any input.

JB

thierrygosselin commented 7 years ago

Hi JB, I'll look at the problem at the end of the day, Quebec time. Thanks! Thierry

thierrygosselin commented 7 years ago

stackr v.0.4.6: haplo2colony is now deprecated in favour of write_colony, which uses more sophisticated code and accepts more input file formats. See the new version commit or the function documentation for details.

Try something simple first, with your batch_2.haplotypes.tsv directly (the function takes care of monomorphic markers), e.g.:

setwd("/media/XXX/")

test <- stackr::write_colony(data = "TRIM.haplotypes.tsv", strata = "you need a strata file here")

The strata file is described in the arguments section of the write_colony documentation; it is basically a STACKS population map with headers.
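As a rough illustration of that description, the sketch below prepends a header line to a STACKS population map (two tab-separated columns, no header). The column names INDIVIDUALS and STRATA are taken from the strata file posted later in this thread; the tab-separated output and the helper name `make_strata` are assumptions, so check the write_colony documentation for the exact requirements:

```shell
# make_strata POPMAP OUTPUT
# Writes OUTPUT as POPMAP with a header line added on top.
# Assumption: POPMAP is a STACKS population map (sample<TAB>population).
make_strata() {
  {
    printf 'INDIVIDUALS\tSTRATA\n'   # header names as used in this thread
    cat "$1"                         # original sample/population rows, unchanged
  } > "$2"
}

# Example call (hypothetical file names):
# make_strata population.map strata.tsv
```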

Look at the new COLONY file in your working directory and try that one with COLONY. If it works, you can then try out different arguments and imputations.

Cheers Thierry

jblamyatifremer commented 7 years ago

Dear Thierry,

I have tried your new stackr version, thanks for the quick response.

I got an error using your newest function, write_colony (I provide my VCF and TSV files). I quickly looked into your R code to find where the problem could come from, but I have not found it yet. I should spend more time on this (after the 6th of December). Maybe you will be quicker.

I tried both types of input (.vcf and .tsv) with a population map following the Stacks format. The outputs are below:

strataa <- read.table("/media/jean-baptiste/Passport0_5/002_PROJETS_CODES/RADSeq_HYSEA/14_reassignation/input/population_strata.csv", sep=",",stringsAsFactors = FALSE,header=TRUE)

write_colony("/media/jean-baptiste/Passport0_5/002_PROJETS_CODES/RADSeq_HYSEA/07_mendel_error_exploration/output/batch_2_3.vcf",
  strata = strataa, pop.levels = NULL, pop.labels = NULL,
  blacklist.id = NULL, blacklist.genotype = NULL,
  whitelist.markers = NULL, monomorphic.out = TRUE, snp.ld = NULL,
  common.markers = TRUE, maf.thresholds = NULL, maf.pop.num.threshold = 1,
  maf.approach = "SNP", maf.operator = "OR", max.marker = NULL,
  sample.markers = NULL, pop.select = "all", allele.freq = "overall",
  inbreeding = 0, mating.sys.males = 0, mating.sys.females = 0,
  clone = 0, run.length = 1, analysis = 1, allelic.dropout = 0,
  error.rate = 0.02, print.all.colony.opt = FALSE,
  imputation.method = NULL, impute = "genotype",
  imputations.group = "populations", num.tree = 100, iteration.rf = 10,
  split.number = 100, verbose = TRUE,
  parallel.core = parallel::detectCores() - 1, filename = "/home/jean-baptiste/colony/colony2_v2.dat")

#######################################################################
######################## stackr::write_colony ########################
#######################################################################
File type: vcf.file
Importing data...
Error in if (biallelic > 4) { : missing value where TRUE/FALSE needed

write_colony("/media/jean-baptiste/Passport0_5/002_PROJETS_CODES/RADSeq_HYSEA/14_reassignation/input/TRIM.haplotypes.tsv",
  strata = strataa, pop.levels = NULL, pop.labels = NULL,
  blacklist.id = NULL, blacklist.genotype = NULL,
  whitelist.markers = NULL, monomorphic.out = TRUE, snp.ld = NULL,
  common.markers = TRUE, maf.thresholds = NULL, maf.pop.num.threshold = 1,
  maf.approach = "SNP", maf.operator = "OR", max.marker = NULL,
  sample.markers = NULL, pop.select = "all", allele.freq = "overall",
  inbreeding = 0, mating.sys.males = 0, mating.sys.females = 0,
  clone = 0, run.length = 1, analysis = 1, allelic.dropout = 0,
  error.rate = 0.02, print.all.colony.opt = FALSE,
  imputation.method = NULL, impute = "genotype",
  imputations.group = "populations", num.tree = 100, iteration.rf = 10,
  split.number = 100, verbose = TRUE,
  parallel.core = parallel::detectCores() - 1, filename = "/home/jean-baptiste/colony/colony2_v2.dat")

#######################################################################
######################## stackr::write_colony ########################
#######################################################################
File type: haplo.file
Importing data...
Error in enc2utf8(col_names(col_labels, sep = sep)) : argument is not a character vector

I just changed the extension from .tsv to .txt, because GitHub only accepts conventional extensions for attached files (the VCF file is too big to attach).

Cheers, JB

PS: Don't hesitate to send me a PM if you want more details. I can send you the VCF via FileSender.

TRIM.haplotypes.txt

thierrygosselin commented 7 years ago

OK, please send me the strata file and I'll test your haplotypes file.

jblamyatifremer commented 7 years ago

While attaching the strata file to this message, I realized that it is in CSV format! But once it is imported into R, it does not matter whether it was CSV or TXT.

There is basically one population.

Have a good day (or rather night).

JB

INDIVIDUALS,STRATA HY008-001_L1_AAACAA,1 HY008-002_L1_ACGTCA,1 HY008-003_L1_CGTCAG,1 HY008-004_L1_CTTATC,1 HY008-005_L1_GAAGTC,1 HY008-007_L1_GCGAGC,1 HY008-008_L1_TGCCCA,1 HY008-010_L1_TTGGCC,1 HY008-011_L1_AAGGG,1 HY008-012_L1_ACAAT,1 HY008-013_L1_CGGAC,1 HY008-014_L1_CTAGA,1 HY008-015_L1_GATAA,1 HY008-016_L1_GCCGC,1 HY008-017_L1_TGTGT,1 HY008-018_L1_TTCAG,1 HY008-019_L1_AGTAGA,1 HY008-020_L1_ATGAAG,1 HY008-021_L1_CAAAGG,1 HY008-022_L1_CCCAAA,1 HY008-023_L1_GGTTCC,1 HY008-024_L1_GTTGGG,1 HY008-025_L1_TACGAG,1 HY008-028_L1_TCCTTC,1 HY008-029_L1_AGCTA,1 HY008-030_L1_ATTCC,1 HY008-032_L1_CACCT,1 HY008-033_L1_CCTTG,1 HY008-034_L1_GGACG,1 HY008-035_L1_GTGTT,1 HY008-039_L1_TAATC,1 HY008-040_L1_TCGCA,1 HY008-009_L2_AAACAA,1 HY008-042_L2_ACGTCA,1 HY008-043_L2_CGTCAG,1 HY008-044_L2_CTTATC,1 HY008-045_L2_GAAGTC,1 HY008-046_L2_GCGAGC,1 HY008-101_L2_TGCCCA,1 HY008-102_L2_TTGGCC,1 HY008-103_L2_AAGGG,1 HY008-104_L2_ACAAT,1 HY008-105_L2_CGGAC,1 HY008-106_L2_CTAGA,1 HY008-107_L2_GATAA,1 HY008-109_L2_GCCGC,1 HY008-110_L2_TGTGT,1 HY008-111_L2_TTCAG,1 HY008-114_L2_AGTAGA,1 HY008-116_L2_ATGAAG,1 HY009-003_L2_CAAAGG,1 HY009-004_L2_CCCAAA,1 HY009-009_L2_GGTTCC,1 HY009-014_L2_GTTGGG,1 HY009-018_L2_TACGAG,1 HY009-021_L2_TCCTTC,1 HY009-027_L2_AGCTA,1 HY009-029_L2_ATTCC,1 HY009-041_L2_CACCT,1 HY009-047_L2_CCTTG,1 HY009-048_L2_GGACG,1 HY009-050_L2_GTGTT,1 HY009-101_L2_TAATC,1 HY009-102_L2_TCGCA,1 HY009-103_L3_AAACAA,1 HY009-104_L3_ACGTCA,1 HY009-105_L3_CGTCAG,1 HY009-107_L3_CTTATC,1 HY009-108_L3_GAAGTC,1 HY009-109_L3_GCGAGC,1 HY009-111_L3_TGCCCA,1 HY009-112_L3_TTGGCC,1 HY009-116_L3_AAGGG,1 HY009-118_L3_ACAAT,1 HY009-120_L3_CGGAC,1 HY009-121_L3_CTAGA,1 HY009-122_L3_GATAA,1 HY009-123_L3_GCCGC,1 HY009-124_L3_TGTGT,1 HY009-125_L3_TTCAG,1 HY009-126_L3_AGTAGA,1 HY009-129_L3_ATGAAG,1 HY009-130_L3_CAAAGG,1
HY009-132_L3_CCCAAA,1 HY009-133_L3_GGTTCC,1 HY009-135_L3_GTTGGG,1 HY009-136_L3_TACGAG,1 HY009-138_L3_TCCTTC,1 HY009-140_L3_AGCTA,1 HY009-141_L3_ATTCC,1 HY009-143_L3_CACCT,1 HY009-144_L3_CCTTG,1 HY009-145_L3_GGACG,1 HY009-146_L3_GTGTT,1 HY009-147_L3_TAATC,1 HY009-149_L3_TCGCA,1 HY009-150_L4_AAACAA,1 HY009-151_L4_ACGTCA,1 HY009-152_L4_CGTCAG,1 HY009-153_L4_CTTATC,1 HY013-001_L4_GAAGTC,1 HY013-002_L4_GCGAGC,1 HY013-003_L4_TGCCCA,1 HY013-004_L4_TTGGCC,1 HY013-005_L4_AAGGG,1 HY013-006_L4_ACAAT,1 HY013-007_L4_CGGAC,1 HY013-008_L4_CTAGA,1 HY013-009_L4_GATAA,1 HY013-010_L4_GCCGC,1 HY013-011_L4_TGTGT,1 HY013-016_L4_TTCAG,1 HY013-017_L4_AGTAGA,1 HY013-019_L4_ATGAAG,1 HY013-020_L4_CAAAGG,1 HY013-021_L4_CCCAAA,1 HY013-022_L4_GGTTCC,1 HY013-023_L4_GTTGGG,1 HY013-024_L4_TACGAG,1 HY013-025_L4_TCCTTC,1 HY013-026_L4_AGCTA,1 HY013-027_L4_ATTCC,1 HY013-028_L4_CACCT,1 HY013-029_L4_CCTTG,1 HY013-030_L4_GGACG,1 HY013-031_L4_GTGTT,1 HY013-032_L4_TAATC,1 HY013-033_L4_TCGCA,1 HY013-034_L5_AAACAA,1 HY013-035_L5_ACGTCA,1 HY013-036_L5_CGTCAG,1 HY013-037_L5_CTTATC,1 HY013-038_L5_GAAGTC,1 HY013-039_L5_GCGAGC,1 HY013-040_L5_TGCCCA,1 HY013-041_L5_TTGGCC,1 HY013-042_L5_AAGGG,1 HY013-043_L5_ACAAT,1 HY013-044_L5_CGGAC,1 HY013-045_L5_CTAGA,1 HY013-046_L5_GATAA,1 HY013-047_L5_GCCGC,1 HY013-048_L5_TGTGT,1 HY013-049_L5_TTCAG,1 HY013-050_L5_AGTAGA,1 HY013-101_L5_ATGAAG,1 HY013-102_L5_CAAAGG,1 HY013-103_L5_CCCAAA,1 HY013-105_L5_GGTTCC,1 HY013-106_L5_GTTGGG,1 HY018-001_L5_TACGAG,1 HY018-002_L5_TCCTTC,1 HY018-003_L5_AGCTA,1 HY018-004_L5_ATTCC,1 HY018-005_L5_CACCT,1 HY018-006_L5_CCTTG,1 HY018-007_L5_GGACG,1 HY018-008_L5_GTGTT,1 HY018-009_L5_TAATC,1 HY018-010_L5_TCGCA,1 HY018-011_L6_AAACAA,1 HY018-012_L6_ACGTCA,1 HY018-013_L6_CGTCAG,1 HY018-014_L6_CTTATC,1 HY018-015_L6_GAAGTC,1 HY018-016_L6_GCGAGC,1 HY018-017_L6_TGCCCA,1 HY018-018_L6_TTGGCC,1 HY018-019_L6_AAGGG,1 HY018-020_L6_ACAAT,1 HY018-021_L6_CGGAC,1 HY018-022_L6_CTAGA,1 HY018-023_L6_GATAA,1 HY018-024_L6_GCCGC,1 HY018-025_L6_TGTGT,1 HY018-026_L6_TTCAG,1 
HY018-027_L6_AGTAGA,1 HY018-028_L6_ATGAAG,1 HY018-029_L6_CAAAGG,1 HY018-030_L6_CCCAAA,1 HY018-031_L6_GGTTCC,1 HY018-032_L6_GTTGGG,1 HY018-033_L6_TACGAG,1 HY018-034_L6_TCCTTC,1 HY018-035_L6_AGCTA,1 HY018-036_L6_ATTCC,1 HY018-037_L6_CACCT,1 HY018-038_L6_CCTTG,1 HY018-039_L6_GGACG,1 HY018-040_L6_GTGTT,1 HY018-041_L6_TAATC,1 HY018-042_L6_TCGCA,1 HY018-043_L7_AAACAA,1 HY018-044_L7_ACGTCA,1 HY018-045_L7_CGTCAG,1 HY018-046_L7_CTTATC,1 HY018-047_L7_GAAGTC,1 HY018-048_L7_GCGAGC,1 HY018-049_L7_TGCCCA,1 HY018-050_L7_TTGGCC,1 HY031-001_L7_AAGGG,1 HY031-002_L7_ACAAT,1 HY031-003_L7_CGGAC,1 HY031-004_L7_CTAGA,1 HY031-005_L7_GATAA,1 HY031-006_L7_GCCGC,1 HY031-007_L7_TGTGT,1 HY031-008_L7_TTCAG,1 HY031-009_L7_AGTAGA,1 HY031-010_L7_ATGAAG,1 HY031-011_L7_CAAAGG,1 HY031-012_L7_CCCAAA,1 HY031-013_L7_GGTTCC,1 HY031-014_L7_GTTGGG,1 HY031-015_L7_TACGAG,1 HY031-016_L7_TCCTTC,1 HY031-017_L7_AGCTA,1 HY031-018_L7_ATTCC,1 HY031-019_L7_CACCT,1 HY031-020_L7_CCTTG,1 HY031-021_L7_GGACG,1 HY031-022_L7_GTGTT,1 HY031-023_L7_TAATC,1 HY031-024_L7_TCGCA,1 HY031-025_L8_AAACAA,1 HY031-026_L8_ACGTCA,1 HY031-027_L8_CGTCAG,1 HY031-028_L8_CTTATC,1 HY031-029_L8_GAAGTC,1 HY031-030_L8_GCGAGC,1 HY031-031_L8_TGCCCA,1 HY031-032_L8_TTGGCC,1 HY031-033_L8_AAGGG,1 HY031-034_L8_ACAAT,1 HY031-035_L8_CGGAC,1 HY031-036_L8_CTAGA,1 HY031-037_L8_GATAA,1 HY031-038_L8_GCCGC,1 HY031-039_L8_TGTGT,1 HY031-040_L8_TTCAG,1 HY031-041_L8_AGTAGA,1 HY031-042_L8_ATGAAG,1 HY031-043_L8_CAAAGG,1 HY031-044_L8_CCCAAA,1 HY031-045_L8_GGTTCC,1 HY031-046_L8_GTTGGG,1 HY031-047_L8_TACGAG,1 HY031-048_L8_TCCTTC,1 HY031-049_L8_AGCTA,1 HY031-050_L8_ATTCC,1 HY032-001_L8_CACCT,1 HY032-003_L8_CCTTG,1 HY032-004_L8_GGACG,1 HY032-005_L8_GTGTT,1 HY032-006_L8_TAATC,1 HY032-007_L8_TCGCA,1 HY032-008_L9_AAACAA,1 HY032-009_L9_ACGTCA,1 HY032-010_L9_CGTCAG,1 HY032-011_L9_CTTATC,1 HY032-012_L9_GAAGTC,1 HY032-013_L9_GCGAGC,1 HY032-014_L9_TGCCCA,1 HY032-015_L9_TTGGCC,1 HY032-016_L9_AAGGG,1 HY032-017_L9_ACAAT,1 HY032-018_L9_CGGAC,1 HY032-019_L9_CTAGA,1 
HY032-020_L9_GATAA,1 HY032-021_L9_GCCGC,1 HY032-022_L9_TGTGT,1 HY032-023_L9_TTCAG,1 HY032-024_L9_AGTAGA,1 HY032-025_L9_ATGAAG,1 HY032-026_L9_CAAAGG,1 HY032-027_L9_CCCAAA,1 HY032-028_L9_GGTTCC,1 HY032-029_L9_GTTGGG,1 HY032-030_L9_TACGAG,1 HY032-031_L9_TCCTTC,1 HY032-032_L9_AGCTA,1 HY032-033_L9_ATTCC,1 HY032-034_L9_CACCT,1 HY032-035_L9_CCTTG,1 HY032-036_L9_GGACG,1 HY032-037_L9_GTGTT,1 HY032-038_L9_TAATC,1 HY032-039_L9_TCGCA,1 HY032-041_L10_AAACAA,1 HY032-042_L10_ACGTCA,1 HY032-043_L10_CGTCAG,1 HY032-044_L10_CTTATC,1 HY032-045_L10_GAAGTC,1 HY032-046_L10_GCGAGC,1 HY032-047_L10_TGCCCA,1 HY032-048_L10_TTGGCC,1 HY032-049_L10_AAGGG,1 HY032-050_L10_ACAAT,1 HY032-101_L10_CGGAC,1 HY032-102_L10_CTAGA,1 HY034-001_L10_GATAA,1 HY034-002_L10_GCCGC,1 HY034-003_L10_TGTGT,1 HY034-004_L10_TTCAG,1 HY034-005_L10_AGTAGA,1 HY034-006_L10_ATGAAG,1 HY034-007_L10_CAAAGG,1 HY034-008_L10_CCCAAA,1 HY034-009_L10_GGTTCC,1 HY034-010_L10_GTTGGG,1 HY034-011_L10_TACGAG,1 HY034-012_L10_TCCTTC,1 HY034-013_L10_AGCTA,1 HY034-014_L10_ATTCC,1 HY034-015_L10_CACCT,1 HY034-016_L10_CCTTG,1 HY034-017_L10_GGACG,1 HY034-018_L10_GTGTT,1 HY034-019_L10_TAATC,1 HY034-020_L10_TCGCA,1 HY034-021_L11_AAACAA,1 HY034-022_L11_ACGTCA,1 HY034-023_L11_CGTCAG,1 HY034-024_L11_CTTATC,1 HY034-025_L11_GAAGTC,1 HY034-026_L11_GCGAGC,1 HY034-027_L11_TGCCCA,1 HY034-028_L11_TTGGCC,1 HY034-029_L11_AAGGG,1 HY034-030_L11_ACAAT,1 HY034-031_L11_CGGAC,1 HY034-032_L11_CTAGA,1 HY034-033_L11_GATAA,1 HY034-035_L11_GCCGC,1 HY034-036_L11_TGTGT,1 HY034-037_L11_TTCAG,1 HY034-038_L11_AGTAGA,1 HY034-039_L11_ATGAAG,1 HY034-040_L11_CAAAGG,1 HY034-041_L11_CCCAAA,1 HY034-042_L11_GGTTCC,1 HY034-043_L11_GTTGGG,1 HY034-044_L11_TACGAG,1 HY034-045_L11_TCCTTC,1 HY034-046_L11_AGCTA,1 HY034-047_L11_ATTCC,1 HY034-048_L11_CACCT,1 HY034-049_L11_CCTTG,1 HY034-050_L11_GGACG,1 HY034-101_L11_GTGTT,1 HY036-001_L11_TAATC,1 HY036-002_L11_TCGCA,1 HY036-003_L12_AAACAA,1 HY036-004_L12_ACGTCA,1 HY036-005_L12_CGTCAG,1 HY036-006_L12_CTTATC,1 HY036-007_L12_GAAGTC,1 
HY036-008_L12_GCGAGC,1 HY036-009_L12_TGCCCA,1 HY036-010_L12_TTGGCC,1 HY036-011_L12_AAGGG,1 HY036-012_L12_ACAAT,1 HY036-013_L12_CGGAC,1 HY036-014_L12_CTAGA,1 HY036-015_L12_GATAA,1 HY036-016_L12_GCCGC,1 HY036-017_L12_TGTGT,1 HY036-018_L12_TTCAG,1 HY036-019_L12_AGTAGA,1 HY036-020_L12_ATGAAG,1 HY036-021_L12_CAAAGG,1 HY036-022_L12_CCCAAA,1 HY036-023_L12_GGTTCC,1 HY036-024_L12_GTTGGG,1 HY036-025_L12_TACGAG,1 HY036-026_L12_TCCTTC,1 HY036-027_L12_AGCTA,1 HY036-028_L12_ATTCC,1 HY036-029_L12_CACCT,1 HY036-030_L12_CCTTG,1 HY036-031_L12_GGACG,1 HY036-032_L12_GTGTT,1 HY036-033_L12_TAATC,1 HY036-034_L12_TCGCA,1 HY036-035_L13_AAACAA,1 HY036-036_L13_ACGTCA,1 HY036-037_L13_CGTCAG,1 HY036-038_L13_CTTATC,1 HY036-039_L13_GAAGTC,1 HY036-040_L13_GCGAGC,1 HY036-041_L13_TGCCCA,1 HY036-042_L13_TTGGCC,1 HY036-043_L13_AAGGG,1 HY036-044_L13_ACAAT,1 HY036-045_L13_CGGAC,1 HY036-046_L13_CTAGA,1 HY036-047_L13_GATAA,1 HY036-048_L13_GCCGC,1 HY48_001_L13_TGTGT,1 HY48_002_L13_TTCAG,1 HY48_003_L13_AGTAGA,1 HY48_004_L13_ATGAAG,1 HY48_005_L13_CAAAGG,1 HY48_006_L13_CCCAAA,1 HY48_007_L13_GGTTCC,1 HY48_008_L13_GTTGGG,1 HY48_009_L13_TACGAG,1 HY48_010_L13_TCCTTC,1 HY48_011_L13_AGCTA,1 HY48_012_L13_ATTCC,1 HY48_013_L13_CACCT,1 HY48_014_L13_CCTTG,1 HY48_015_L13_GGACG,1 HY48_016_L13_GTGTT,1 HY48_017_L13_TAATC,1 HY48_018_L13_TCGCA,1 HY48_019_L14_AAACAA,1 HY48_020_L14_ACGTCA,1 HY48_021_L14_CGTCAG,1 HY48_022_L14_CTTATC,1 HY48_023_L14_GAAGTC,1 HY48_024_L14_GCGAGC,1 HY48_025_L14_TGCCCA,1 HY48_026_L14_TTGGCC,1 HY48_027_L14_AAGGG,1 HY48_028_L14_ACAAT,1 HY48_029_L14_CGGAC,1 HY48_030_L14_CTAGA,1 HY48_031_L14_GATAA,1 HY48_032_L14_GCCGC,1 HY48_033_L14_TGTGT,1 HY48_034_L14_TTCAG,1 HY48_035_L14_AGTAGA,1 HY48_036_L14_ATGAAG,1 HY48_037_L14_CAAAGG,1 HY48_038_L14_CCCAAA,1 HY48_039_L14_GGTTCC,1 HY48_040_L14_GTTGGG,1 HY48_042_L14_TACGAG,1 HY48_043_L14_TCCTTC,1 HY48_044_L14_AGCTA,1 HY48_045_L14_ATTCC,1 HY48_046_L14_CACCT,1 HY48_047_L14_CCTTG,1 HY48_048_L14_GGACG,1 HY48_049_L14_GTGTT,1 HY48_050_L14_TAATC,1 HY59_001_L14_TCGCA,1 
HY59_002_L15_AAACAA,1 HY59_003_L15_ACGTCA,1 HY59_004_L15_CGTCAG,1 HY59_005_L15_CTTATC,1 HY59_006_L15_GAAGTC,1 HY59_007_L15_GCGAGC,1 HY59_008_L15_TGCCCA,1 HY59_009_L15_TTGGCC,1 HY59_010_L15_AAGGG,1 HY59_0101_L15_ACAAT,1 HY59_0102_L15_CGGAC,1 HY59_0103_L15_CTAGA,1 HY59_0104_L15_GATAA,1 HY59_0106_L15_GCCGC,1 HY59_011_L15_TGTGT,1 HY59_012_L15_TTCAG,1 HY59_013_L15_AGTAGA,1 HY59_014_L15_ATGAAG,1 HY59_015_L15_CAAAGG,1 HY59_016_L15_CCCAAA,1 HY59_017_L15_GGTTCC,1 HY59_018_L15_GTTGGG,1 HY59_019_L15_TACGAG,1 HY59_020_L15_TCCTTC,1 HY59_021_L15_AGCTA,1 HY59_022_L15_ATTCC,1 HY59_023_L15_CACCT,1 HY59_024_L15_CCTTG,1 HY59_025_L15_GGACG,1 HY59_026_L15_GTGTT,1 HY59_027_L15_TAATC,1 HY59_028_L15_TCGCA,1 HY59_029_L16_AAACAA,1 HY59_030_L16_ACGTCA,1 HY59_032_L16_CGTCAG,1 HY59_033_L16_CTTATC,1 HY59_034_L16_GAAGTC,1 HY59_035_L16_GCGAGC,1 HY59_036_L16_TGCCCA,1 HY59_037_L16_TTGGCC,1 HY59_038_L16_AAGGG,1 HY59_039_L16_ACAAT,1 HY59_040_L16_CGGAC,1 HY59_042_L16_CTAGA,1 HY59_043_L16_GATAA,1 HY59_044_L16_GCCGC,1 HY59_046_L16_TGTGT,1 HY59_047_L16_TTCAG,1 G4A4_F2_L16_AGTAGA,1 G5A5_F1_L16_ATGAAG,1 A7G7_F1_L16_CAAAGG,1 A9G9_F2_L16_CCCAAA,1 A9G9_M1_L16_GGTTCC,1 G1A1_M1_L16_GTTGGG,1 G3A3_M2_L16_TACGAG,1 G5A5_M2_L16_TCCTTC,1 G_M4_L16_AGCTA,1 G_M5_L16_ATTCC,1 A_M2_L16_CACCT,1 A_M4_L16_CCTTG,1 hy38f2_L16_GGACG,1 hy38m2_L16_GTGTT,1 hy40f4_L16_TAATC,1 hy40m1_L16_TCGCA,1

thierrygosselin commented 7 years ago

Hi JB,

Try:

Test1: as mentioned above

test <- stackr::write_colony(data = "TRIM.haplotypes.tsv", strata = "you need a strata file here")

Does this work or not?

Test2 with strata file already in the global environment:

strata <- readr::read_tsv(file = "strata.test.colony.tsv")
stackr::write_colony(data = "TRIM.haplotypes.txt", strata = strata)

This should also work
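Test2 reads the strata with readr::read_tsv, which expects tab-separated columns, whereas the strata file posted above is comma-separated. One possible conversion is sketched below, assuming no sample ID or stratum contains a comma; `csv_to_tsv` is a hypothetical helper and the file names in the example call are placeholders:

```shell
# csv_to_tsv INPUT OUTPUT
# Replaces every comma with a tab.
# Assumption: fields are unquoted and contain no embedded commas.
csv_to_tsv() {
  tr ',' '\t' < "$1" > "$2"
}

# Example call (hypothetical file names):
# csv_to_tsv population_strata.csv strata.test.colony.tsv
```

Reading the CSV directly in R with readr::read_csv and passing the resulting data frame as strata would be an alternative that avoids the extra file.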

Comments

Cheers Thierry

jblamyatifremer commented 7 years ago

Dear Thierry

Test1: as mentioned above

test <- stackr::write_colony(data = "TRIM.haplotypes.tsv", strata = "you need a strata file here")

My previous command was messy, with non-existent options, as you highlighted. I was able to get a COLONY input file from the command above, but COLONY threw an error (I do not have my Linux workstation with me to reproduce the error message). Roughly, COLONY complained about the format and the amount of data.

However, I retried the same command with a .vcf file from the same dataset (with the same strata file): no more errors, and COLONY is still working on it. That is good.

Tomorrow, I will try again to get a COLONY input with stackr from the .tsv file to figure out what is going on.

I also see that you are involved in MapComp. I will use it very soon. Bravo for your work, it is very useful!

Test2 with strata file already in the global environment:

strata <- readr::read_tsv(file = "strata.test.colony.tsv")
stackr::write_colony(data = "TRIM.haplotypes.txt", strata = strata)