Closed htomelka closed 2 years ago
Hi,
The error reports that "show_col_types = FALSE" is an unused argument, which means the installed readr is too old to recognise it. Please check and update your R packages; readr should be v2.0.0 or later.
Cheers,
Simon
@leightonpayne should we add the dependency checking script as an optional argument for padloc?
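A check along these lines can confirm whether the installed readr meets the v2.0.0 requirement mentioned above. This is only a minimal sketch, not padloc's own dependency checking script: the helper name `version_ge` is hypothetical, and it assumes a `sort` that supports `-V` (GNU coreutils or similar) and that `Rscript` is on the PATH.

```shell
# Hypothetical helper: succeeds if dotted version $1 is >= dotted version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

required="2.0.0"

if command -v Rscript >/dev/null 2>&1; then
    # Ask R for the installed readr version; stays empty if readr is not installed.
    installed="$(Rscript -e 'cat(as.character(packageVersion("readr")))' 2>/dev/null || true)"
    if [ -n "$installed" ] && version_ge "$installed" "$required"; then
        echo "readr $installed is recent enough"
    else
        echo "readr is missing or older than $required; update it in R, e.g. install.packages(\"readr\")"
    fi
else
    echo "Rscript not found on PATH"
fi
```

If the check reports an old version, make sure the R that padloc picks up is the same one you updated, since several R installations can coexist on one machine.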
Hi!
I was loading a version of R for other purposes, and it was interfering with padloc. Problem solved!
Thank you!
One issue solved, another appears...
I ran padloc on several files with this command: padloc --faa XXXX --gff XXXX --outdir XXXX --debug --cpu 8
All the files were generated the same way, but for some of them I get this error:
[10:19:13] DEBUG >> Reading 1397.gff
Error: Problem with `mutate()` column `ID`.
i `ID = ifelse(is.na(pseudo), ID, Name)`.
x object 'Name' not found
Backtrace:
x
1. +-gff %>% mutate(ID = ifelse(is.na(pseudo), ID, Name))
2. +-dplyr::mutate(., ID = ifelse(is.na(pseudo), ID, Name))
3. +-dplyr:::mutate.data.frame(., ID = ifelse(is.na(pseudo), ID, Name))
4. | \-dplyr:::mutate_cols(.data, ..., caller_env = caller_env())
5. | +-base::withCallingHandlers(...)
6. | \-mask$eval_all_mutate(quo)
7. +-base::ifelse(is.na(pseudo), ID, Name)
8. \-base::.handleSimpleError(...)
9. \-dplyr:::h(simpleError(msg, call))
Execution halted
[10:19:14] ERROR >> errexit on line 397
My dplyr version is 1.0.7, and since the issue does not occur for all my files, I can't work out where the problem is...
I have attached the files that trigger the issue. If you can see a solution, thanks!
Seems to be an edge case related to how we deal with pseudogenes in the PGAP-formatted RefSeq files. If the [pseudo] field is present in the gff attributes (e.g. "pseudo="), we take [Name] as the [ID]. In your case, [pseudo] is present without [Name] (hence the somewhat obscure error "object 'Name' not found"). I'm not familiar with the "MicroScope annotation platform" used to generate this gff, so the easiest solution I found was to remove all ";pseudo=None" in the supplied .gff, and that solved the issue. I've made a note to add more informative error reporting for cases like this.
Cheers,
Simon
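The workaround described above (removing every ";pseudo=None" from the .gff) can be sketched with standard tools. This is a hedged illustration, not part of padloc: the function names and the demo line are made up, and it assumes the attributes sit in the ninth tab-separated column, as in standard GFF3.

```shell
# Hypothetical helper: print a GFF with every literal ";pseudo=None" removed.
strip_pseudo_none() {
    sed 's/;pseudo=None//g' "$1"
}

# Hypothetical helper: list feature lines whose attributes contain a
# "pseudo=" key but no "Name=" key (the combination described above).
find_pseudo_without_name() {
    awk -F'\t' '$9 ~ /(^|;)pseudo=/ && $9 !~ /(^|;)Name=/' "$1"
}

# Small demonstration on a made-up GFF line.
demo_in="$(mktemp)"
printf 'contig1\tMicroScope\tCDS\t1\t300\t.\t+\t0\tID=gene1;pseudo=None;product=hypothetical protein\n' > "$demo_in"

echo "Lines with pseudo but no Name:"
find_pseudo_without_name "$demo_in"

echo "After stripping ;pseudo=None:"
strip_pseudo_none "$demo_in"

rm -f "$demo_in"
```

To clean a real file, redirect the output to a new file (for example `strip_pseudo_none input.gff > cleaned.gff`, with both names being placeholders) and pass the cleaned file to --gff. Note that the sed pattern only matches ";pseudo=None" preceded by a semicolon, so an attribute list that starts with "pseudo=None" would need a separate pattern.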
Thank you for your help, everything works fine now!
Hi! I tried to run padloc with .faa and .gff files and got an error. Thinking the issue was with my files, I tried the test data and got the same error:
padloc --faa GCF_001688665.2.faa --gff GCF_001688665.2.gff --cpu 4
[09:35:59] >> Scanning GCF_001688665.2 for defence system proteins
[09:37:40] >> Searching GCF_001688665.2 for defence systems
Error in read_tsv(., col_names = c("temp", "target.description"), comment = "#", : unused argument (show_col_types = FALSE)
Calls: read_domtbl ... type_convert -> stopifnot -> is.data.frame -> separate
Execution halted
[09:37:47] ERROR >> errexit on line 397
It seems to be the same issue as #16.
If you have a solution, thanks!