ropensci / phylotaR

An automated pipeline for retrieving orthologous DNA sequences from GenBank in R
https://docs.ropensci.org/phylotaR
Other
23 stars 8 forks source link

failing to run drop_by_rank() #48

Closed sly1983 closed 1 year ago

sly1983 commented 4 years ago

Hi, I am trying to reduce the sequence data by using drop_by_rank() but getting the following error:

Error in phylota@sqs@sqs[[i]] : attempt to select less than one element in get1index

I am running the pipeline according to the tutorial:

library(phylotaR) wd <- '/media/... ncbi_dr <- '/home/... txid <- 4747 setup(wd = wd, txid = txid, ncbi_dr = ncbi_dr, v=TRUE) run(wd = wd)

all_clusters <- read_phylota(wd) print(all_clusters) cids <- all_clusters@cids n_taxa <- get_ntaxa(phylota = all_clusters, cid = cids) keep <- cids[taxa_all > 10] selected <- drop_clstrs(phylota = all_clusters, cid = keep) reduced <- drop_by_rank(selected, rnk = 'species', n=1)

My experience with R is unfortunately quite limited and I appreciate the help. Thanks!

DomBennett commented 4 years ago

Hi!

What version of phylotaRare you using?

Did you install from CRAN, like this: install.packages("phylotaR") or from GitHub like this devtools::install_packages("ropensci/phylotaR")?

Updating to the latest version on GitHub might fix the issue for you. Try reinstalling the package, restarting R and running the code again.

DomBennett commented 4 years ago

Otherwise, given that you have such a large taxonomic group, 4747.

You could try to break down the clusters to a smaller subset to determine on which cluster or sequences the issue is arising.

You could drop a random subset of clusters, the largest clusters or the smallest clusters using the drop_clsters() function.

Or you could break the clusters down by taxonomic group... e.g. to break the clusters down to an individual genus you could:

library(phylotaR)

data("bromeliads")

# fetch the genus id for every sequence in the cluster
genus_ids <- get_txids(phylota = bromeliads, sid = bromeliads@sids,
                       rnk = 'genus')
# in genus_ids, each element refers to an NCBI taxon ID for a genus
# names(genus_ids), will show the labelled name attached to each element
# these names refer to the sequence ID in the cluster

# identify the unique genera
unique_genus_ids <- unique(genus_ids)
# drop empty IDs
unique_genus_ids <- unique_genus_ids[unique_genus_ids != '']
# convert genus IDs to names (if needed)
scientific_names <- get_tx_slot(phylota = bromeliads, txid = unique_genus_ids,
                                slt_nm = 'scnm')

# choose a genus...
# Option 1: .... by name
bool <- "Ananas" == scientific_names
annas_id <- names(scientific_names)[bool]
bool <- genus_ids == annas_id
seq_ids_to_keep <- names(genus_ids)[bool]

# Option 2: .... by number of sequences
freq_table <- sort(table(genus_ids), decreasing = TRUE)
bool <- genus_ids == names(freq_table)[1]
seq_ids_to_keep <- names(genus_ids)[bool]

# shrink down clusters to just the sequences of the specific genus
genus_clusters <- bromeliads
for (cid in genus_clusters@cids) {
  genus_clusters <- drop_sqs(phylota = genus_clusters, cid = cid,
                             sid = seq_ids_to_keep)
}
# drop all clusters with 0 seqs left
n_taxa <- get_ntaxa(phylota = genus_clusters, cid = genus_clusters@cids)
genus_clusters <- drop_clstrs(phylota = genus_clusters,
                              cid = genus_clusters@cids[n_taxa > 0])
# ^ try printing the genus_clusters to test the changes
sly1983 commented 4 years ago

Hi,

thanks a lot for the reply. Unfortunately after updating the version the issue is still there. Yes, the taxonomic group is huge. The second suggestion is working to check individual genera. But since there are hundreds of genera it is not really feasible to go through them one by one. Would there be a way to check this in a more automated fashion?

Thanks again for the help!

mrubenss commented 4 years ago

Hi,

I'm also getting the same error using drop_by_rank(). I've tried both proposals above without success.

I have no clue what may be causing this. Thank you so much for your assistance.

bheimbu commented 1 year ago

Hi,

I'm experiencing the very same error. I've installed the latest version from Github via remotes::install_github('ropensci/phylotaR'). Is there any solution to overcome this error, apart from the options provided above?

Cheers Bastian

ShixiangWang commented 1 year ago

@bheimbu I cannot reproduce the error with GitHub version and txid <- 9504. txid <- 4747 takes too much time to download, and it always is interrupted in my environment. What's the id are you working on? If you succeed in the run(), could you share the zipped directory?

ncbi_dr = "/workspaces/blast/bin/"
devtools::load_all()
wd = "/workspaces/test2"
#unlink(wd)

# library(phylotaR)
# wd <- '/media/...
# ncbi_dr <- '/home/...
# txid <- 4747
txid <- 9504
setup(wd = wd, txid = txid, ncbi_dr = ncbi_dr, v=TRUE, overwrite = TRUE)
run(wd = wd)

all_clusters <- read_phylota(wd)
print(all_clusters)
cids <- all_clusters@cids
n_taxa <- get_ntaxa(phylota = all_clusters, cid = cids)
keep <- cids[n_taxa >= 5]
selected <- drop_clstrs(phylota = all_clusters, cid = keep)
reduced <- drop_by_rank(selected, rnk = 'species', n=1)

My log:

r$> # Test example from issue https://github.com/ropensci/phylotaR/issues/48
    ncbi_dr = "/workspaces/blast/bin/"

r$> devtools::load_all()
ℹ Loading phylotaR
^[[200~wd = "/workspaces/test2"^[[201~

r$> wd = "/workspaces/test2"

r$> #unlink(wd)

    # library(phylotaR)
    # wd <- '/media/...
    # ncbi_dr <- '/home/...
    # txid <- 4747
    txid <- 9504

r$> setup(wd = wd, txid = txid, ncbi_dr = ncbi_dr, v=TRUE)
-------------------------------------------------
phylotaR: Implementation of PhyLoTa in R [v1.3.0]
-------------------------------------------------
Checking for valid NCBI BLAST+ Tools ...
Found: [/workspaces/blast/bin/makeblastdb]
Found: [/workspaces/blast/bin/blastn]
. . Running makeblastdb
Setting up pipeline with the following parameters:
. blstn          [/workspaces/blast/bin/blastn]
. btchsz         [100]
. date           [2023-01-03]
. db_only        [FALSE]
. mdlthrs        [3000]
. mkblstdb       [/workspaces/blast/bin/makeblastdb]
. mncvrg         [51]
. mnsql          [250]
. multiple_ids   [FALSE]
. mxevl          [1e-10]
. mxnds          [1e+05]
. mxrtry         [100]
. mxsql          [2000]
. mxsqs          [50000]
. ncps           [1]
. outfmt         [6 qseqid sseqid pident length evalue qcovs qcovhsp]
. outsider       [FALSE]
. srch_trm       [NOT predicted[TI] NOT "whole genome shotgun"[TI] NOT unverified[TI] NOT "synthetic construct"[Organism] NOT refseq[filter] NOT TSA[Keyword]]
. txid           [9504]
. v              [TRUE]
. wd             [/workspaces/test2]
. wt_tms         [1, 3, 6 ...]
Error in cache_setup(ps = ps, ovrwrt = overwrite) : 
  Cache already exists, ovrwrt = FALSE.

r$> setup(wd = wd, txid = txid, ncbi_dr = ncbi_dr, v=TRUE, overwrite = TRUE)
-------------------------------------------------
phylotaR: Implementation of PhyLoTa in R [v1.3.0]
-------------------------------------------------
Checking for valid NCBI BLAST+ Tools ...
Found: [/workspaces/blast/bin/makeblastdb]
Found: [/workspaces/blast/bin/blastn]
. . Running makeblastdb
Setting up pipeline with the following parameters:
. blstn          [/workspaces/blast/bin/blastn]
. btchsz         [100]
. date           [2023-01-03]
. db_only        [FALSE]
. mdlthrs        [3000]
. mkblstdb       [/workspaces/blast/bin/makeblastdb]
. mncvrg         [51]
. mnsql          [250]
. multiple_ids   [FALSE]
. mxevl          [1e-10]
. mxnds          [1e+05]
. mxrtry         [100]
. mxsql          [2000]
. mxsqs          [50000]
. ncps           [1]
. outfmt         [6 qseqid sseqid pident length evalue qcovs qcovhsp]
. outsider       [FALSE]
. srch_trm       [NOT predicted[TI] NOT "whole genome shotgun"[TI] NOT unverified[TI] NOT "synthetic construct"[Organism] NOT refseq[filter] NOT TSA[Keyword]]
. txid           [9504]
. v              [TRUE]
. wd             [/workspaces/test2]
. wt_tms         [1, 3, 6 ...]
-------------------------------------------------

r$> run(wd = wd)
---------------------------------------------------
Running pipeline on [unix] at [2023-01-03 05:09:10]
---------------------------------------------------
Running stages: taxise, download, cluster, cluster2
--------------------------------------------
Starting stage TAXISE: [2023-01-03 05:09:10]
--------------------------------------------
Searching taxonomic IDs ...
Downloading taxonomic records ...
. [1-22]
Generating taxonomic dictionary ...
---------------------------------------------
Completed stage TAXISE: [2023-01-03 05:09:16]
---------------------------------------------
----------------------------------------------
Starting stage DOWNLOAD: [2023-01-03 05:09:16]
----------------------------------------------
Identifying suitable clades ...
Identified [1] suitable clades.
Downloading hierarchically ...
Working on parent [id 9504]: [1/1] ...
. + direct ...
. + by child ...
. . Working on child [id 2688256]
. . + whole subtree ...
. . . Downloading [34 sqs] ...
. . . . [1-34]
. . . Working on child [id 1263727]
. . . + whole subtree ...
. . . . Downloading [3 sqs] ...
. . . . . [1-3]
. . . . Working on child [id 361674]
. . . . + whole subtree ...
. . . . . Downloading [1 sqs] ...
. . . . . . [1-1]
. . . . . Working on child [id 292213]
. . . . . + whole subtree ...
. . . . . . Downloading [100 sqs] ...
. . . . . . . [1-100]
. . . . . . Working on child [id 57176]
. . . . . . + whole subtree ...
. . . . . . . Downloading [200 sqs] ...
. . . . . . . . [1-100]
. . . . . . . . [101-200]
. . . . . . . Working on child [id 57175]
. . . . . . . + whole subtree ...
. . . . . . . . Downloading [100 sqs] ...
. . . . . . . . . [1-100]
. . . . . . . . Working on child [id 43147]
. . . . . . . . + whole subtree ...
. . . . . . . . . Downloading [900 sqs] ...
. . . . . . . . . . [1-100]
. . . . . . . . . . [101-200]
. . . . . . . . . . [201-300]
. . . . . . . . . . [301-400]
. . . . . . . . . . [401-500]
. . . . . . . . . . [501-600]
. . . . . . . . . . [601-700]
. . . . . . . . . . [701-800]
. . . . . . . . . . [801-900]
. . . . . . . . . Working on child [id 37293]
. . . . . . . . . + whole subtree ...
. . . . . . . . . . Downloading [600 sqs] ...
. . . . . . . . . . . [1-100]
. . . . . . . . . . . [101-200]
. . . . . . . . . . . [201-300]
. . . . . . . . . . . [301-400]
. . . . . . . . . . . [401-500]
. . . . . . . . . . . [501-600]
. . . . . . . . . . Working on child [id 30591]
. . . . . . . . . . + whole subtree ...
. . . . . . . . . . . Downloading [1000 sqs] ...
. . . . . . . . . . . . [1-100]
. . . . . . . . . . . . [101-200]
. . . . . . . . . . . . [201-300]
. . . . . . . . . . . . [301-400]
. . . . . . . . . . . . [401-500]
. . . . . . . . . . . . [501-600]
. . . . . . . . . . . . [601-700]
. . . . . . . . . . . . [701-800]
. . . . . . . . . . . . [801-900]
. . . . . . . . . . . . [901-1000]
. . . . . . . . . . . Working on child [id 9505]
. . . . . . . . . . . + whole subtree ...
. . . . . . . . . . . . Downloading [300 sqs] ...
. . . . . . . . . . . . . [1-100]
. . . . . . . . . . . . . [101-200]
. . . . . . . . . . . . . [201-300]
Successfully retrieved [3238 sqs] in total.
-----------------------------------------------
Completed stage DOWNLOAD: [2023-01-03 05:12:04]
-----------------------------------------------
---------------------------------------------
Starting stage CLUSTER: [2023-01-03 05:12:04]
---------------------------------------------
Working from [id 9504] down hierarchy
. + direct [id 9504 (genus)]
. + subtree [id 9504 (genus)]
. . BLASTing [3238 sqs] ....
. . Running makeblastdb
. . Running blastn
. . Removed [86407/181398] hits due to insufficient coverage
. . Identified [794] clusters
. + parent [id 9504]
. . + direct [id 2688256 (no rank)]
. . + subtree [id 2688256 (no rank)]
. . . BLASTing [34 sqs] ....
. . . Removed [0/417] hits due to insufficient coverage
. . . Identified [15] clusters
. . + parent [id 2688256]
. . . + subtree [id 1230482 (species)]
. . + parent [id 2688256]
. . . + subtree [id 1090913 (species)]
. . + parent [id 2688256]
. . . + subtree [id 1002694 (species)]
. . + parent [id 2688256]
. . . + subtree [id 940829 (species)]
. . + parent [id 2688256]
. . . + subtree [id 413234 (species)]
. . + parent [id 2688256]
. . . + subtree [id 261316 (species)]
. . + parent [id 2688256]
. . . + subtree [id 231953 (species)]
. . + parent [id 2688256]
. . . + subtree [id 222417 (species)]
. + parent [id 9504]
. . + subtree [id 1263727 (species)]
. + parent [id 9504]
. . + subtree [id 361674 (species)]
. + parent [id 9504]
. . + subtree [id 292213 (species)]
. + parent [id 9504]
. . + subtree [id 57176 (species)]
. + parent [id 9504]
. . + subtree [id 57175 (species)]
. + parent [id 9504]
. . + subtree [id 43147 (species)]
. + parent [id 9504]
. . + subtree [id 37293 (species)]
. + parent [id 9504]
. . + direct [id 30591 (species)]
. . . BLASTing [565 sqs] ....
. . . Removed [36203/51582] hits due to insufficient coverage
. . . Identified [264] clusters
. . + subtree [id 30591 (species)]
. . . BLASTing [435 sqs] ....
. . . Removed [911/7414] hits due to insufficient coverage
. . . Identified [71] clusters
. . + parent [id 30591]
. . . + subtree [id 867331 (subspecies)]
. . + parent [id 30591]
. . . + subtree [id 280755 (subspecies)]
. . + parent [id 30591]
. . . + subtree [id 120088 (subspecies)]
. + parent [id 9504]
. . + subtree [id 9505 (species)]
[1/1]
----------------------------------------------
Completed stage CLUSTER: [2023-01-03 05:12:23]
----------------------------------------------
-----------------------------------------------
Starting stage CLUSTER^2: [2023-01-03 05:12:23]
-----------------------------------------------
Loading clusters ...
Done. Only one cluster set -- skipping cluster^2
Dropping all clusters of < 3 sqs ...
Renumbering clusters ...
Saving ...
------------------------------------------------
Completed stage CLUSTER^2: [2023-01-03 05:12:24]
------------------------------------------------
-------------------------------------------
Completed pipeline at [2023-01-03 05:12:24]
-------------------------------------------

r$> all_clusters <- read_phylota(wd)

r$> print(all_clusters)
[1] "Phylota Table (Aotus)\n- [184] clusters\n- [1945] sequences\n- [13] source taxa\n"

r$> all_clusters
Phylota Table (Aotus)
- [184] clusters
- [1945] sequences
- [13] source taxa

r$> cids <- all_clusters@cids

r$> cids
  [1] "0"   "1"   "2"   "3"   "4"   "5"   "6"   "7"   "8"  
 [10] "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"  "17" 
 [19] "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26" 
 [28] "27"  "28"  "29"  "30"  "31"  "32"  "33"  "34"  "35" 
 [37] "36"  "37"  "38"  "39"  "40"  "41"  "42"  "43"  "44" 
 [46] "45"  "46"  "47"  "48"  "49"  "50"  "51"  "52"  "53" 
 [55] "54"  "55"  "56"  "57"  "58"  "59"  "60"  "61"  "62" 
 [64] "63"  "64"  "65"  "66"  "67"  "68"  "69"  "70"  "71" 
 [73] "72"  "73"  "74"  "75"  "76"  "77"  "78"  "79"  "80" 
 [82] "81"  "82"  "83"  "84"  "85"  "86"  "87"  "88"  "89" 
 [91] "90"  "91"  "92"  "93"  "94"  "95"  "96"  "97"  "98" 
[100] "99"  "100" "101" "102" "103" "104" "105" "106" "107"
[109] "108" "109" "110" "111" "112" "113" "114" "115" "116"
[118] "117" "118" "119" "120" "121" "122" "123" "124" "125"
[127] "126" "127" "128" "129" "130" "131" "132" "133" "134"
[136] "135" "136" "137" "138" "139" "140" "141" "142" "143"
[145] "144" "145" "146" "147" "148" "149" "150" "151" "152"
[154] "153" "154" "155" "156" "157" "158" "159" "160" "161"
[163] "162" "163" "164" "165" "166" "167" "168" "169" "170"
[172] "171" "172" "173" "174" "175" "176" "177" "178" "179"
[181] "180" "181" "182" "183"

r$> n_taxa <- get_ntaxa(phylota = all_clusters, cid = cids)

r$> n_taxa
  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14 
  5  10   5   2   2   9   9   9   9   9   8   9   9   9   4 
 15  16  17  18  19  20  21  22  23  24  25  26  27  28  29 
  1   9   8   8   8   8   2   1   1   1   1   4   2   2   1 
 30  31  32  33  34  35  36  37  38  39  40  41  42  43  44 
  1   1   2   2   2   4   2   1   1   1   1   2   2   2   2 
 45  46  47  48  49  50  51  52  53  54  55  56  57  58  59 
  2   2   2   2   2   2   3   2   3   1   2   4   3   1   1 
 60  61  62  63  64  65  66  67  68  69  70  71  72  73  74 
  1   1   1   4   2   1   1   6   2   5   1   1   4   4   5 
 75  76  77  78  79  80  81  82  83  84  85  86  87  88  89 
  1   1   1   5   1   1   1   3   5   5   5   5   5   5   5 
 90  91  92  93  94  95  96  97  98  99 100 101 102 103 104 
  5   5   5   5   5   5   5   5   5   5   5   5   5   5   5 
105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 
  5   5   5   5   5   3   4   4   4   4   4   4   4   4   4 
120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 
  4   4   4   4   4   4   4   4   4   4   4   3   3   4   4 
135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 
  1   4   4   1   1   1   2   2   3   3   3   3   3   3   3 
150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 
  3   3   3   3   3   3   3   3   3   3   3   3   3   3   3 
165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 
  3   3   3   3   3   3   1   3   1   1   3   1   1   1   1 
180 181 182 183 
  1   1   2   2 

r$> keep <- cids[taxa_all > 10]
Error: object 'taxa_all' not found

r$> n_taxa <- get_ntaxa(phylota = all_clusters, cid = cids)

r$> n_taxa
  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14 
  5  10   5   2   2   9   9   9   9   9   8   9   9   9   4 
 15  16  17  18  19  20  21  22  23  24  25  26  27  28  29 
  1   9   8   8   8   8   2   1   1   1   1   4   2   2   1 
 30  31  32  33  34  35  36  37  38  39  40  41  42  43  44 
  1   1   2   2   2   4   2   1   1   1   1   2   2   2   2 
 45  46  47  48  49  50  51  52  53  54  55  56  57  58  59 
  2   2   2   2   2   2   3   2   3   1   2   4   3   1   1 
 60  61  62  63  64  65  66  67  68  69  70  71  72  73  74 
  1   1   1   4   2   1   1   6   2   5   1   1   4   4   5 
 75  76  77  78  79  80  81  82  83  84  85  86  87  88  89 
  1   1   1   5   1   1   1   3   5   5   5   5   5   5   5 
 90  91  92  93  94  95  96  97  98  99 100 101 102 103 104 
  5   5   5   5   5   5   5   5   5   5   5   5   5   5   5 
105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 
  5   5   5   5   5   3   4   4   4   4   4   4   4   4   4 
120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 
  4   4   4   4   4   4   4   4   4   4   4   3   3   4   4 
135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 
  1   4   4   1   1   1   2   2   3   3   3   3   3   3   3 
150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 
  3   3   3   3   3   3   3   3   3   3   3   3   3   3   3 
165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 
  3   3   3   3   3   3   1   3   1   1   3   1   1   1   1 
180 181 182 183 
  1   1   2   2 

r$> cids
  [1] "0"   "1"   "2"   "3"   "4"   "5"   "6"   "7"   "8"  
 [10] "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"  "17" 
 [19] "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26" 
 [28] "27"  "28"  "29"  "30"  "31"  "32"  "33"  "34"  "35" 
 [37] "36"  "37"  "38"  "39"  "40"  "41"  "42"  "43"  "44" 
 [46] "45"  "46"  "47"  "48"  "49"  "50"  "51"  "52"  "53" 
 [55] "54"  "55"  "56"  "57"  "58"  "59"  "60"  "61"  "62" 
 [64] "63"  "64"  "65"  "66"  "67"  "68"  "69"  "70"  "71" 
 [73] "72"  "73"  "74"  "75"  "76"  "77"  "78"  "79"  "80" 
 [82] "81"  "82"  "83"  "84"  "85"  "86"  "87"  "88"  "89" 
 [91] "90"  "91"  "92"  "93"  "94"  "95"  "96"  "97"  "98" 
[100] "99"  "100" "101" "102" "103" "104" "105" "106" "107"
[109] "108" "109" "110" "111" "112" "113" "114" "115" "116"
[118] "117" "118" "119" "120" "121" "122" "123" "124" "125"
[127] "126" "127" "128" "129" "130" "131" "132" "133" "134"
[136] "135" "136" "137" "138" "139" "140" "141" "142" "143"
[145] "144" "145" "146" "147" "148" "149" "150" "151" "152"
[154] "153" "154" "155" "156" "157" "158" "159" "160" "161"
[163] "162" "163" "164" "165" "166" "167" "168" "169" "170"
[172] "171" "172" "173" "174" "175" "176" "177" "178" "179"
[181] "180" "181" "182" "183"

r$> keep <- cids[n_taxa > 10]

r$> selected <- drop_clstrs(phylota = all_clusters, cid = keep
    )

r$> reduced <- drop_by_rank(selected, rnk = 'species', n=1)

r$> reduced
Phylota Table (Aotus)
- [0] clusters
- [0] sequences
- [0] source taxa

r$> keep <- cids[n_taxa >= 4]

r$> keep
 [1] "0"   "1"   "2"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12" 
[12] "13"  "14"  "16"  "17"  "18"  "19"  "20"  "26"  "35"  "56"  "63" 
[23] "67"  "69"  "72"  "73"  "74"  "78"  "83"  "84"  "85"  "86"  "87" 
[34] "88"  "89"  "90"  "91"  "92"  "93"  "94"  "95"  "96"  "97"  "98" 
[45] "99"  "100" "101" "102" "103" "104" "105" "106" "107" "108" "109"
[56] "111" "112" "113" "114" "115" "116" "117" "118" "119" "120" "121"
[67] "122" "123" "124" "125" "126" "127" "128" "129" "130" "133" "134"
[78] "136" "137"

r$> keep <- cids[n_taxa >= 5]

r$> selected <- drop_clstrs(phylota = all_clusters, cid = keep)

r$> selected
Phylota Table (Aotus)
- [48] clusters
- [1073] sequences
- [13] source taxa

r$> reduced <- drop_by_rank(selected, rnk = 'species', n=1)
bheimbu commented 1 year ago

Hi @ShixiangWang,

the zip file is too large for uploading (77 MB), please download it from here.

I've also tried your example above, but without success, still getting the same error Error in phylota@sqs@sqs[[i]] : attempt to select less than one element in get1index.

Cheers Bastian

ShixiangWang commented 1 year ago

@bheimbu You can send it to shixiang1994wang@gmail.com.

It's a little strange, did you set the id to 9504? My whole directory is only 16 MB. If you also set 9504 but the working directory is 77MB, please retry with another new working directory. If the error occurs, send the zipped directory via the email above.

@ShixiangWang ➜ /workspaces $ du -sh test2
16M     test2
@ShixiangWang ➜ /workspaces $ ls test2
blast  blast_versions.txt  cache  log.txt  session_info.txt

Best,

Shixiang

bheimbu commented 1 year ago

Hi,

no sorry I'm using taxid 1912919 (Termitoidae (termites)).

Cheers Bastian

ShixiangWang commented 1 year ago

@bheimbu Thanks, I got it.

bheimbu commented 1 year ago

Hi @ShixiangWang,

do you have any news or solution?

Cheers Bastian

ShixiangWang commented 1 year ago

@bheimbu I can reproduce the error in my test environment. It would take some time to explore the error and solution. Thanks for your example data. I will reply when I have any updates.

ShixiangWang commented 1 year ago

@bheimbu Please install the latest version from GitHub and retry again, the specific error should be fixed

remotes::install_github('ropensci/phylotaR')
bheimbu commented 1 year ago

@ShixiangWang it works, many thx.

ShixiangWang commented 1 year ago

Great, happy to hear that.