Error (error: In path.expand(path)) from spThin::thin(..., write.files = TRUE) caused by long file names when "spec.col" has many levels #29
Hi,
I ran into a problem saving files from thin(), which I describe below together with the modification I made to the "thin.R" script. The change alters how the output files are named so that they:
have unique file names
do not become too long
Current state
PROBLEM: saving thinned data by setting the option write.files = TRUE in spThin::thin(...)
The thinning function spThin::thin(...) has the option to save each species' thinned dataset as it is generated. However, in the source code (at https://github.com/mlammens/spThin/blob/master/R/thin.R), the way the file names are created makes each subsequent csv file name longer than the previous one, i.e.
If the first csv is saved as "new.csv", the second is saved as "new_new.csv", the third as "new_new_new.csv", etc. (line 185 in the source code). For datasets with very many levels under "spec.col", the file names become too long as the number of species thinned increases, causing (error: In path.expand(path)).
This naming scheme is there to prevent overwriting: the base file name (line 170) has no unique identifier for the species, so different csv files could otherwise end up with the same name. At line 185, the names are modified by appending "_new" for each subsequent thinned dataset.
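The growth in name length can be reproduced with a small stand-alone sketch. The helper next_free_name() and the base name "thinned_data_thin1.csv" are hypothetical and only mimic the behaviour described above; they are not copied from the spThin source.

```r
# Hypothetical stand-in for the renaming behaviour described above:
# while the target name is taken, insert "_new" before the extension.
next_free_name <- function(path) {
  while (file.exists(path)) {
    path <- sub("\\.csv$", "_new.csv", path)
  }
  path
}

# Work in a fresh temporary directory so nothing real is overwritten.
out_dir <- tempfile("thin_demo_")
dir.create(out_dir)

# Every "species" starts from the same base name, as in the issue.
base_name <- file.path(out_dir, "thinned_data_thin1.csv")

written <- character(0)
for (i in 1:5) {
  target <- next_free_name(base_name)
  file.create(target)
  written <- c(written, basename(target))
}

# Each successive file name is 4 characters ("_new") longer than the last.
name_lengths <- nchar(written)
print(data.frame(file = written, chars = name_lengths))
```

With several hundred levels in "spec.col", a name built this way grows by four characters per species, which is consistent with the path eventually becoming too long for the file system.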
SOLUTION PROPOSED
Modify the thin() function's source code by changing how the files are named, so that the species name is included in the file name, i.e.
At line 170, add the species name to the thinned output file name, i.e. replacing:
csv.files <- paste( out.dir, out.base, "_thin", rep(1:n.csv), ".csv", sep="")
With:
csv.files <- paste( out.dir, out.base, "thin", gsub(" ", "", as.character(species)), rep(1:n.csv), ".csv", sep="")
This makes every file name unique to its species, so line 185, which adds "_new" to every subsequent file name, becomes unnecessary and can be removed.
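As a quick check of the proposed naming, the replacement expression can be wrapped in a small function and run outside the package. The wrapper build_csv_names(), the species names, and the "out/" and "thinned_data_" values are illustrative assumptions; only the paste() expression comes from the proposal above.

```r
# Hypothetical wrapper around the proposed line-170 expression.
build_csv_names <- function(out.dir, out.base, species, n.csv) {
  paste(out.dir, out.base, "thin",
        gsub(" ", "", as.character(species)),  # strip spaces from the species name
        rep(1:n.csv), ".csv", sep = "")
}

# Illustrative inputs, not taken from any real dataset.
species_list <- c("Panthera leo", "Loxodonta africana")

all_names <- unlist(lapply(species_list, function(sp) {
  build_csv_names("out/", "thinned_data_", sp, n.csv = 2)
}))

print(all_names)

# No name is repeated across species, and the length of each name
# depends only on the species name, not on how many species came before.
stopifnot(anyDuplicated(all_names) == 0)
```

Note that gsub(" ", "", ...) removes only spaces; species names containing other characters that are awkward in file names (for example "/" or ":") would need further cleaning.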
RESULTS:
Regards